Closed NouamaneTazi closed 8 months ago
Steps to use your custom model in nanotron
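- Edit the `MistralConfig` class in `config_tiny_mistral.py` to match your model's configuration
- Edit the `MistralForTraining` class in `modeling_mistral.py` to match your model's architecture
- Pass both classes to the `DistributedTrainer` class in `run_train.py`:

trainer = DistributedTrainer(config_file, model_class=MistralForTraining, model_config_class=MistralConfig)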
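
A minimal sketch of the pattern behind the trainer call above, runnable without nanotron installed. It shows that the trainer receives the config class and the model class themselves, not instances. Everything here is a hypothetical stand-in: `MyConfig`, `MyModelForTraining`, `StubTrainer`, the field names, and the `config_tiny_mistral.yaml` file name are assumptions for illustration, and `StubTrainer` mirrors only the call shape of `DistributedTrainer`, not its actual behavior.

```python
from dataclasses import dataclass


@dataclass
class MyConfig:
    """Hypothetical stand-in for MistralConfig; field names are illustrative."""
    hidden_size: int = 64
    num_hidden_layers: int = 2


class MyModelForTraining:
    """Hypothetical stand-in for MistralForTraining."""

    def __init__(self, config: MyConfig):
        self.config = config


class StubTrainer:
    """Mirrors only the constructor call shape of DistributedTrainer.

    This is NOT nanotron's implementation. As a simplification it builds the
    config from dataclass defaults and never reads the config file.
    """

    def __init__(self, config_file, model_class, model_config_class):
        self.config_file = config_file
        # The trainer is handed classes and instantiates them itself.
        self.model_config = model_config_class()
        self.model = model_class(self.model_config)


# Hypothetical config file name, used only as a label in this sketch.
trainer = StubTrainer(
    "config_tiny_mistral.yaml",
    model_class=MyModelForTraining,
    model_config_class=MyConfig,
)
print(type(trainer.model).__name__, trainer.model.config.hidden_size)
```

Passing classes rather than instances leaves construction to the trainer, so swapping in a custom model only means changing the two keyword arguments.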
Simple and nice! Looks good to merge.