pytorch / torchtune

PyTorch native finetuning library
https://pytorch.org/torchtune/main/
BSD 3-Clause "New" or "Revised" License

OpenCoder, Request for adding a model #2058

Open insop opened 4 days ago

insop commented 4 days ago

The OpenCoder team has released OpenCoder 1.5B and 8B models, and they seem very promising. I would like to request that the team add this model to torchtune.

Thank you!

ebsmothers commented 3 days ago

Hi @insop, thanks for creating the issue. Given that the model is still relatively new, we would like to wait and see how it is adopted before onboarding it as part of our core offering. Fortunately, that shouldn't stop you from finetuning it with torchtune. We encourage folks to plug in custom components, and for this model it should be relatively easy to do so. Since the architecture is the same as Llama, you should be able to do the following:

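# Reuse torchtune's Llama 3 component builder (note: imported from a private module)
from torchtune.models.llama3._component_builders import llama3

# Build a Llama-style decoder with the OpenCoder 8B hyperparameters
opencoder_8b = llama3(
    vocab_size=96640,
    num_layers=32,
    num_heads=32,
    num_kv_heads=8,
    embed_dim=4096,
    max_seq_len=8192,
    rope_base=500000.0,
    intermediate_dim=14336,
)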
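
One quick way to sanity-check the hyperparameters above is to estimate the parameter count in plain Python, without instantiating the model. This is only a sketch: it assumes the usual Llama 3 layout (grouped-query attention, a three-matrix gated MLP, two norms per layer plus a final norm, no bias terms, and an output projection that is not tied to the token embeddings). The helper name `llama_param_count` is hypothetical and is not part of torchtune.

```python
def llama_param_count(vocab_size, num_layers, num_heads, num_kv_heads,
                      embed_dim, intermediate_dim):
    """Estimate the parameter count of a Llama-style decoder.

    Assumes no bias terms and an untied output projection.
    """
    head_dim = embed_dim // num_heads
    kv_dim = num_kv_heads * head_dim  # width of the K and V projections under GQA

    # Attention: q_proj and o_proj are embed_dim x embed_dim,
    # k_proj and v_proj are embed_dim x kv_dim.
    attn = 2 * embed_dim * embed_dim + 2 * embed_dim * kv_dim
    # Gated MLP: gate, up, and down projections.
    mlp = 3 * embed_dim * intermediate_dim
    # Two norm layers per block, each with embed_dim scale weights.
    norms = 2 * embed_dim
    per_layer = attn + mlp + norms

    token_embeddings = vocab_size * embed_dim
    output_proj = vocab_size * embed_dim  # assumed untied from the embeddings
    final_norm = embed_dim

    return num_layers * per_layer + token_embeddings + output_proj + final_norm


opencoder_8b_params = llama_param_count(
    vocab_size=96640,
    num_layers=32,
    num_heads=32,
    num_kv_heads=8,
    embed_dim=4096,
    intermediate_dim=14336,
)
print(f"{opencoder_8b_params / 1e9:.2f}B parameters")  # prints "7.77B parameters"
```

Under these assumptions the estimate comes out a little under 8B, which is in line with the model's name. A count far from that would suggest one of the values passed to the builder is wrong.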

I will need to look at the tokenizer a bit more closely, but given that it appears to use SentencePiece with some additional preprocessing, I suspect it should be a small modification of our Llama2Tokenizer. I'm happy to provide more detailed pointers to help you get started here.

insop commented 3 days ago

Hi @ebsmothers

Thank you so much for explaining how I could approach this. Your reasoning about when to bring the model in makes sense.

I am new to torchtune, so any pointers will be helpful and appreciated. Please do let me know if you have any pointers on the tokenizer or on putting together a PR that brings in a new model.

Thank you,