epfLLM / meditron

Meditron is a suite of open-source medical Large Language Models (LLMs).
https://huggingface.co/epfl-llm
Apache License 2.0

Can't run finetuning script (wrong paths?) #31

Open th789 opened 8 months ago

th789 commented 8 months ago

Hello Meditron team,

Thank you so much for sharing your work! I'd like to follow your instructions to fine-tune the meditron model, but I get an error (potentially due to wrong paths). Specifically, I perform the following:

  1. Navigate to the meditron folder: cd path/meditron
  2. Run the script: python finetuning/sft.py --checkpoint=meditron --size=7 --run_name=pubmedqa --data bigbio/pubmedqa

However, I get the following error:

python finetuning/sft.py --checkpoint=meditron --size=7 --run_name=pubmedqa --data bigbio/pubmedqa
Tokenizing data!
Traceback (most recent call last):
  File "/n/home07/than157/desktop/llm-med/meditron/Megatron-LLM/tools/preprocess_instruct_data.py", line 28, in <module>
    from megatron.tokenizer import build_tokenizer
ModuleNotFoundError: No module named 'megatron.tokenizer'
Traceback (most recent call last):
  File "/n/home07/than157/desktop/llm-med/meditron/finetuning/sft.py", line 268, in <module>
    main(args)
  File "/n/home07/than157/desktop/llm-med/meditron/finetuning/sft.py", line 206, in main
    data_prefix = tokenize_data(
  File "/n/home07/than157/desktop/llm-med/meditron/finetuning/sft.py", line 85, in tokenize_data
    execute(cmd)
  File "/n/home07/than157/desktop/llm-med/meditron/finetuning/sft.py", line 41, in execute
    assert proc.wait() == 0
AssertionError
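
To narrow this down, here is a minimal sketch of how the failing import could be checked outside of sft.py. It only tests whether a dotted module name resolves, optionally after temporarily putting a directory at the front of `sys.path`. The helper name `can_import` is hypothetical (it is not part of meditron or Megatron-LLM), and the idea that the `megatron` package sits at the root of the Megatron-LLM checkout is my assumption, not something I have confirmed.

```python
import importlib
import importlib.util
import sys
from typing import Optional


def can_import(module_name: str, extra_path: Optional[str] = None) -> bool:
    """Return True if `module_name` can be located.

    Hypothetical diagnostic helper. If `extra_path` is given, it is
    temporarily prepended to sys.path for the duration of the check.
    Note that resolving a dotted name imports its parent package.
    """
    original_path = list(sys.path)
    try:
        if extra_path is not None:
            sys.path.insert(0, extra_path)
        # Make sure freshly added directories are picked up by the finders.
        importlib.invalidate_caches()
        try:
            return importlib.util.find_spec(module_name) is not None
        except (ImportError, ValueError):
            # ImportError (incl. ModuleNotFoundError) is raised when the
            # parent package is missing or fails to import.
            return False
    finally:
        # Restore sys.path so the check has no lasting side effect on it.
        sys.path[:] = original_path
```

For example, running `can_import("megatron.tokenizer")` and then `can_import("megatron.tokenizer", "Megatron-LLM")` from the meditron folder would show whether the module only resolves when the Megatron-LLM directory is on the path (again assuming that is where the package lives).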

I've spent hours trying to figure out the right paths, but to no avail. I would be so grateful if you could help me with the following so I can run your script:

  1. How can I fix the error above?
  2. How should I set CHECKPOINTS in sft.py to fine-tune the meditron-7b model that I downloaded from Hugging Face?

Thank you very much!