mosaicml / llm-foundry

LLM training code for Databricks foundation models
https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm
Apache License 2.0

Unable to use self-developed pre-trained model for fine-tuning in MosaicML #1291

Closed: sauravgrd closed this issue 5 days ago

sauravgrd commented 2 weeks ago

❓ Question

I have pretrained a model on my own dataset using a custom attention implementation (without using MosaicML's LLM Foundry). When I try to fine-tune this pre-trained model via MosaicML by pointing the fine-tuning config YAML at my model's local or Google Cloud Storage path, MosaicML downloads a model artifact it considers appropriate from Hugging Face instead of using my model's location. How can I use my own model artifact for fine-tuning?

Additional context

Note: I saved my pretrained model using torch.save()
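
This detail may be where the mismatch comes from: torch.save() writes a raw pickle/state dict, while loading by path through the Hugging Face machinery expects a save_pretrained()-style checkpoint directory. A minimal sketch of the difference, using gpt2 as a stand-in for the user's model and a hypothetical output directory (a genuinely custom architecture would additionally need its own PreTrainedModel subclass, typically loaded with trust_remote_code):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# torch.save() produces a raw pickle / state dict. Loaders that expect a
# Hugging Face checkpoint (config.json + weight files) cannot read it:
#   torch.save(model.state_dict(), "my_custom_model.pt")

# A Hugging Face-style checkpoint is written with save_pretrained(),
# which emits config.json plus the weight files that
# AutoModelForCausalLM.from_pretrained() knows how to load:
model = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in model
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model.save_pretrained("./my_hf_checkpoint")      # hypothetical directory
tokenizer.save_pretrained("./my_hf_checkpoint")

# The resulting directory can be referenced by a local path anywhere a
# hub model name is accepted:
reloaded = AutoModelForCausalLM.from_pretrained("./my_hf_checkpoint")
```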

dakinggg commented 1 week ago

LLM Foundry supports either MPT models or models that are on Hugging Face. If you come with an unknown architecture, LLM Foundry will not be able to train it. Please provide more information if I have misunderstood your question.
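
For context, a sketch of how the model section of a fine-tuning config typically points at a checkpoint, based on the example YAMLs under scripts/train/yamls/finetune/ in this repo; exact keys may vary across foundry versions, and the local path shown is hypothetical:

```yaml
model:
  name: hf_causal_lm
  pretrained: true
  # Must point at a Hugging Face-format checkpoint directory
  # (config.json + weight files), not a torch.save() pickle:
  pretrained_model_name_or_path: /path/to/my_hf_checkpoint
```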