Thanks for doing this. Information-overload is hitting the ML research scene this year, and the community needs this kind of thing.
Observations / concerns / suggestions:
For how long will this paper be relevant? I see 5 revisions. Are the authors planning to keep this up-to-date as an overview/reference?
The connection between the paper and the repo isn't obvious. Maybe explain that at the top of the README.md.
It's a lot of work to maintain. If ML researchers feel confident it will be kept up-to-date, they'll bookmark it more. If you plan to support the resource into the future, I suggest you state the plan / mission statement at the top.
How about some way for other researchers to muck in and distribute the load/burden/effort? e.g. There's a tiny section in the paper: "H: Libraries" that doesn't really help one make a decision what tech to use. It's just a bunch of hyperlinks.
Are you trying to be an encyclopaedia or an engineers' handbook? (I hope the latter; knowledge distillation helps many, accretion not really)
Maybe consider putting the paper source in the repo, and allowing contributions?
Would be nice to crosslink to similar amalgam-papers, e.g. Maybe there is an equivalent for StableDiffusion/NormalizingFlows/ConsistencyModels, maybe there's even a one-level-up nexus where one can see listed summaries of various active fields in ML. If you can crosslink effectively / embed into the information network, that'll strengthen the work.
We will keep it up to date with the latest literature, and will probably release a few more versions before we finalize it.
The purpose is to serve as a quick reference, and possibly to provide other helpful material in the near future. We will add a statement of purpose as suggested.
Yes, we welcome PRs from the community to improve this repo
That section is just a brief overview of libraries widely used to train LLMs. We will add more details in the future.
An overview of research in LLMs.
I don't think research papers are typically shared with the community in a way that lets others contribute to the paper itself. Some people do contribute and receive credit in the acknowledgments section, but I doubt many would be interested in contributing without being listed as an author (a list that is already fixed).
Thanks for the suggestion; we will look into that.
Overall, thanks for your comments. In case you have any questions or queries, feel free to leave a comment.
I've been looking at the paper.