Closed eldarkurtic closed 4 months ago
I am not able to reproduce this. The yaml you linked (and other finetuning runs) work fine for me. Could you please provide more information?
Closing due to inactivity. We are regularly finetuning models without issue, but please let us know if this is persistent!
Hi,
I think there is something weird going on with the finetuning flow in the nightly version of llm-foundry. Trying to reproduce a finetuning run from any of the examples available in the repo (e.g. https://github.com/mosaicml/llm-foundry/blob/main/scripts/train/yamls/finetune/7b_dolly_sft.yaml) fails with:
If it helps, finetuning works just fine if I manually convert the finetuning dataset into the StreamingDataset format and then load it as such. But this is a bit inconvenient to do every time I test out a new dataset. Pulling datasets straight from the HF hub and tokenizing them on the fly was a super useful feature of llm-foundry.
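For context, the two flows being compared look roughly like this in the dataloader section of the yaml. This is only a sketch: the `hf_name`/`split` keys are taken from the linked 7b_dolly_sft.yaml, while the `remote`/`local` keys for the pre-converted StreamingDataset path and the output directory are assumptions and may differ across llm-foundry versions:

```yaml
# Sketch only -- verify key names against your llm-foundry version.

# On-the-fly flow (the one that currently fails for me):
# pull the dataset from the HF hub and tokenize during training.
train_loader:
  name: finetuning
  dataset:
    hf_name: mosaicml/dolly_hfhub
    split: train

# Workaround: pre-convert the dataset to StreamingDataset (MDS) shards,
# then point the loader at the converted copy instead of the hub name.
# train_loader:
#   name: finetuning
#   dataset:
#     remote: /path/to/converted/mds   # hypothetical conversion output dir
#     local: /tmp/mds-cache
#     split: train
```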