@WilliamsToTo which OLMo model are you using? Note that some of them end with `-hf`, so please be specific.
The current installation requirements don't really support OLMo; that is being updated in #151. It would be wonderful if you could try that PR :)
I downloaded `allenai/OLMo-7B-Instruct` from Hugging Face. What is the difference between `allenai/OLMo-7B-Instruct-hf` and `allenai/OLMo-7B-Instruct`?
TL;DR: the `-hf` versions are natively compatible with HuggingFace transformers; see https://github.com/huggingface/transformers/pull/29890. In the future, all of them will be compatible. This was a lesson learned from our first new models. @WilliamsToTo
Try this: https://huggingface.co/allenai/OLMo-7B-Instruct-hf
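For context, the practical difference is that the `-hf` checkpoint loads with stock transformers, while the original checkpoint needs the `hf_olmo` integration. A minimal sketch, assuming transformers >= 4.40 (the release line containing the PR above):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The -hf checkpoint is natively supported: plain transformers, no extras.
model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-Instruct-hf")
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-7B-Instruct-hf")

# The original allenai/OLMo-7B-Instruct instead relies on the custom model
# classes registered by the hf_olmo module from the ai2-olmo package, so it
# only loads after `pip install ai2-olmo` and `import hf_olmo`.
```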
Got it. Thanks a lot. I'm using https://huggingface.co/allenai/OLMo-7B-Instruct-hf. It supports flash attention.
When I use LoRA to fine-tune OLMo-7B-Instruct with `finetune_lora_with_accelerate.sh`, it reports an error.
I tried to change `use_flash_attention_2=True if args.use_flash_attn else False` to `flash_attention=True if args.use_flash_attn else False`, as mentioned at https://github.com/allenai/OLMo/issues/557#issuecomment-2078369571, but it does not work. Do you know how to fix it?
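For reference, recent transformers versions dropped the `use_flash_attention_2` boolean in favor of the `attn_implementation` argument to `from_pretrained`. A minimal sketch of that generic pattern, not a confirmed fix from this thread; `use_flash_attn` stands in for `args.use_flash_attn` in the training script:

```python
import torch
from transformers import AutoModelForCausalLM

use_flash_attn = True  # stand-in for args.use_flash_attn in the training script

# `attn_implementation` replaces the deprecated `use_flash_attention_2` kwarg;
# flash attention 2 also requires a half-precision dtype (fp16 or bf16).
model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-7B-Instruct-hf",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2" if use_flash_attn else "eager",
)
```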