llm-efficiency-challenge / neurips_llm_efficiency_challenge

NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day

Should we migrate the starter kit from lit-llama to lit-gpt #6

Closed: bkowshik closed this issue 1 year ago

bkowshik commented 1 year ago

Ref: https://github.com/Lightning-AI/lit-llama

> The open-source code in this repository works with the original LLaMA weights that are distributed by Meta under a research-only license.
>
> New Apache 2.0 licensed weights are being released as part of the Open LLaMA project. To use the Open LLaMA weights or other LLaMA-like checkpoints such as Vicuna, check out the Lit-GPT repository.
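
As a rough illustration, loading an Open LLaMA checkpoint through Lit-GPT could look something like the sketch below. It assumes the weights were already fetched and converted with Lit-GPT's `scripts/download.py` and `scripts/convert_hf_checkpoint.py` helpers, and that the `open_llama_3b` config name and the `checkpoints/` layout match the repository's conventions at the time; treat these as assumptions rather than the exact starter-kit code.

```python
from pathlib import Path

import torch
from lit_gpt import GPT, Config, Tokenizer

# Assumed layout produced by Lit-GPT's download/convert scripts, e.g.:
#   python scripts/download.py --repo_id openlm-research/open_llama_3b
#   python scripts/convert_hf_checkpoint.py \
#       --checkpoint_dir checkpoints/openlm-research/open_llama_3b
checkpoint_dir = Path("checkpoints/openlm-research/open_llama_3b")

# Build the model skeleton from the named config and load the converted weights.
config = Config.from_name("open_llama_3b")
model = GPT(config)
model.load_state_dict(torch.load(checkpoint_dir / "lit_model.pth", map_location="cpu"))
model.eval()

# The tokenizer reads its files straight from the checkpoint directory.
tokenizer = Tokenizer(checkpoint_dir)
print(tokenizer.encode("Hello, Open LLaMA"))
```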

msaroufim commented 1 year ago

We'd be very open to that. If you'd like to send a PR, we'd be happy to review it.

carmocca commented 1 year ago

Hi! I work on Lit-GPT.

I would encourage you to switch, mainly because:

Organizers and participants alike, feel free to reach out on the Lit-GPT issue tracker if you have questions about the code or want to share something.

Good luck hacking!

bkowshik commented 1 year ago

Made a Jupyter notebook on Kaggle to get open-llama 3B working, based on the documentation and discussions in the repository (e.g., https://github.com/Lightning-AI/lit-gpt/issues/254#issuecomment-1632473883). Now that we have a working setup of the open-llama 3B model, the next step is to make the corresponding changes in the toy-example.

Link to the notebook: https://www.kaggle.com/bkowshik/lightning-ai-lit-gpt
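
As a rough sketch of what the toy-example change could look like, here is a minimal greedy-decoding loop against the same converted checkpoint. It relies on the same assumptions as above (checkpoint layout and config name), and Lit-GPT's own `generate/base.py` script is the fuller reference implementation:

```python
from pathlib import Path

import torch
from lit_gpt import GPT, Config, Tokenizer

checkpoint_dir = Path("checkpoints/openlm-research/open_llama_3b")  # assumed layout
tokenizer = Tokenizer(checkpoint_dir)
model = GPT(Config.from_name("open_llama_3b"))
model.load_state_dict(torch.load(checkpoint_dir / "lit_model.pth", map_location="cpu"))
model.eval()

# Naive greedy decoding: re-run the full forward pass each step
# (no KV cache, so it is slow but easy to follow).
idx = tokenizer.encode("The NeurIPS efficiency challenge is")
with torch.inference_mode():
    for _ in range(20):
        logits = model(idx.unsqueeze(0))[:, -1]  # logits for the last position
        next_token = logits.argmax(dim=-1)       # pick the most likely token
        idx = torch.cat([idx, next_token])
print(tokenizer.decode(idx))
```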