apple / corenet

CoreNet: A library for training deep neural networks

How to Load OpenELM Pre-training Checkpoints using Hugging Face AutoModelForCausalLM? #43

Open jasonkrone opened 1 month ago

jasonkrone commented 1 month ago

Hi there,

First, really admire the work on OpenELM! Thank you for making your models and code available.

Question regarding the pre-training checkpoints linked here: how can we convert these checkpoints into the format expected by AutoModelForCausalLM.from_pretrained?

I presume there's a script that was used for conversion of the final model weights into HF format, but I couldn't find it in the repo.

Would very much appreciate any help on this!

Best, Jason

a154377713 commented 4 weeks ago

I have also encountered the same problem. Do you have a solution?

jasonkrone commented 3 weeks ago

I didn't wind up solving this, but here's a reference that might be helpful: https://github.com/foundation-model-stack/foundation-model-stack/blob/4349dacef63e86b6c1acdccb69b48fe562365bb2/fms/models/llama.py#L592
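
For anyone attempting the conversion by hand, one common approach is to rename the checkpoint's state-dict keys with regex rules and then diff the result against the keys the Hugging Face model expects. Below is a minimal sketch of that idea, not the script used for the released weights. The rename rules and key names are hypothetical placeholders (the real CoreNet and HF OpenELM parameter names would need to be read off the two state dicts), and `rename_keys` and `compare_keys` are illustrative helpers, not part of either library.

```python
import re
from typing import Any, Dict, Iterable, List, Tuple

# HYPOTHETICAL rules: (regex pattern, replacement). Replace these with the
# mapping you derive by inspecting both state dicts.
EXAMPLE_RULES: List[Tuple[str, str]] = [
    (r"^module\.", ""),  # e.g. strip a wrapper prefix, if one is present
    (r"^layers\.(\d+)\.", r"transformer.layers.\1."),  # e.g. add a model prefix
]


def rename_keys(
    state_dict: Dict[str, Any], rules: List[Tuple[str, str]]
) -> Dict[str, Any]:
    """Apply every rule, in order, to each key and return a new dict."""
    renamed: Dict[str, Any] = {}
    for old_key, value in state_dict.items():
        new_key = old_key
        for pattern, replacement in rules:
            new_key = re.sub(pattern, replacement, new_key)
        if new_key in renamed:
            # Two source keys collapsing into one would silently drop weights.
            raise ValueError(f"key collision: {old_key!r} -> {new_key!r}")
        renamed[new_key] = value
    return renamed


def compare_keys(
    converted: Iterable[str], expected: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) keys relative to the expected set."""
    converted_set, expected_set = set(converted), set(expected)
    missing = sorted(expected_set - converted_set)
    unexpected = sorted(converted_set - expected_set)
    return missing, unexpected


# Toy stand-in for a checkpoint; real values would be tensors.
toy_checkpoint = {"module.layers.0.attn.weight": 0, "module.norm.weight": 1}
toy_converted = rename_keys(toy_checkpoint, EXAMPLE_RULES)
toy_missing, toy_unexpected = compare_keys(
    toy_converted,
    ["transformer.layers.0.attn.weight", "transformer.norm.weight"],
)
```

With real files, the input would presumably come from something like `torch.load(path, map_location="cpu")` (assuming the checkpoint is a plain PyTorch state dict, possibly nested under another key), and the expected keys from the `state_dict()` of a model loaded via `AutoModelForCausalLM.from_pretrained`. Key renaming alone may not be enough if the two formats also lay out tensors differently (for example fused versus split projections), so a final comparison of model outputs is worth doing.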