mosaicml/llm-foundry

LLM training code for Databricks foundation models
https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm
Apache License 2.0

Finetuning Models #562

Open ak2028 opened 1 year ago

ak2028 commented 1 year ago

I followed the tutorial at train/finetune_example/mpt-7b-arc-easy--gpu.yaml and added an extra evaluation via `icl_tasks: 'eval/yamls/tasks_light.yaml'` so I could measure accuracy on ARC Easy. As the model finetuned, training loss decreased, but evaluation accuracy decreased as well, which looks like a bug.

I repeated this with the full ARC Easy training set and the same thing happened. Is there a reason finetuning would drive training loss down while evaluation accuracy also goes down?
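For reference, the relevant additions to the finetuning YAML were roughly the following. The `icl_tasks` line is exactly what I added; the `eval_interval` value here is only illustrative of how often the evaluation ran:

```yaml
# Additions to train/finetune_example/mpt-7b-arc-easy--gpu.yaml (sketch).
icl_tasks: 'eval/yamls/tasks_light.yaml'  # includes ARC Easy
eval_interval: 500ba  # illustrative: evaluate every 500 batches
```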

samhavens commented 1 year ago

For the run on the full ARC Easy training set, can you share the changes you made to the YAML?

ak2028 commented 1 year ago

Sure, the only change was `data_dir: train/finetune_example/arc-easy/`. Inside `arc-easy/` I have a `train.jsonl`.

I downloaded the data from: https://huggingface.co/datasets/ai2_arc
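In case the data format matters: I converted the downloaded split into the prompt/response JSONL layout that the finetuning dataloader expects by default. A line of my `train.jsonl` looks roughly like this (the prompt template is my own; only the `prompt`/`response` keys are assumed to be the defaults):

```json
{"prompt": "Question: Which factor will most likely cause a person to develop a fever?\nChoices:\nA. a leg muscle relaxing after exercise\nB. a bacterial population in the bloodstream\nC. several viral particles on the skin\nD. carbohydrates being digested in the stomach\nAnswer:", "response": " B"}
```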