SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
https://arxiv.org/pdf/2406.16858
Apache License 2.0

new model #66

Open shunxing12345 opened 5 months ago

shunxing12345 commented 5 months ago

Hi, I want to add a model whose architecture differs from LLaMA, but when I ran `accelerate launch -m --mixed_precision=bf16 eagle.train.main --tmpdir [path of data] --cpdir [path of checkpoints] --configpath [path of config file]` I got the following error: [error screenshot]

Liyuhui-12 commented 5 months ago

It seems the embedding layer in your model is not named `embed_tokens`. You can change the name the code looks for to match your model's embedding layer.
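For example, the training script locates the target model's embedding weights by key name. A minimal sketch of that lookup (not EAGLE's exact code; the checkpoint path and key name are placeholders you would adapt):

```python
# Sketch of loading embedding weights by name from a safetensors checkpoint.
# If your model calls its embedding layer something other than "embed_tokens"
# (e.g. "transformer.wte" in GPT-style models), change EMB_KEY accordingly.
import json
import os

from safetensors import safe_open

basepath = "/path/to/target-model"     # placeholder: a safetensors checkpoint dir
EMB_KEY = "model.embed_tokens.weight"  # <-- rename to match your architecture

with open(os.path.join(basepath, "model.safetensors.index.json")) as f:
    emb_file = json.load(f)["weight_map"][EMB_KEY]

with safe_open(os.path.join(basepath, emb_file), framework="pt", device="cpu") as f:
    emb_weight = f.get_tensor(EMB_KEY)

print(emb_weight.shape)  # (vocab_size, hidden_size)
```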

shunxing12345 commented 5 months ago

Thanks for your reply! I ran into another problem: I am trying to train an LLM whose architecture differs from both LLaMA and Mixtral. Should I change the code in cnet.py? It seems to be based on LLaMA.

Liyuhui-12 commented 5 months ago

This is not necessary; EAGLE's structure is independent of the target model. You can use the same cnet.py, or you can try other structures as well.
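To illustrate why the draft model is target-agnostic: it only consumes the target model's last hidden states plus the token ids, so any target works as long as the dimensions line up. A rough stand-in for the one-layer structure in cnet.py (an illustrative sketch, not the actual implementation):

```python
import torch
import torch.nn as nn


class TinyDraftModel(nn.Module):
    """Minimal stand-in for EAGLE's one-layer draft model."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, hidden_size)
        # EAGLE fuses the target's hidden state with the token embedding
        self.fc = nn.Linear(2 * hidden_size, hidden_size)
        self.layer = nn.TransformerEncoderLayer(hidden_size, nhead=8, batch_first=True)

    def forward(self, hidden_states, input_ids):
        # hidden_states: (batch, seq, hidden) taken from the target model
        x = torch.cat([hidden_states, self.embed_tokens(input_ids)], dim=-1)
        return self.layer(self.fc(x))


m = TinyDraftModel(hidden_size=512, vocab_size=32000)
out = m(torch.randn(1, 4, 512), torch.randint(0, 32000, (1, 4)))  # (1, 4, 512)
```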

shunxing12345 commented 4 months ago

Thanks! I have finetuned a 12B model, but I get an OOM error at `model, head, optimizer, train_loader, test_loader, scheduler = accelerator.prepare(model, head, optimizer, train_loader, test_loader, scheduler)`. I have 8 × 40GB A100s. [error screenshot] This is my train_config: [screenshot] and this is my config.json: [screenshot]

Liyuhui-12 commented 4 months ago

I noticed that your "n_layers" is set to 38, which makes your draft model very large. In EAGLE, the draft model consists of only one layer.
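Concretely, whatever your config calls the layer-count field (`num_hidden_layers` in LLaMA-style configs, `n_layers` in yours) should be 1 for the draft model. An illustrative config for a LLaMA-style target (values here are placeholders, not a tested config):

```json
{
  "architectures": ["LlamaForCausalLM"],
  "hidden_size": 5120,
  "intermediate_size": 13824,
  "num_attention_heads": 40,
  "num_hidden_layers": 1,
  "vocab_size": 32000
}
```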

shunxing12345 commented 4 months ago

Hi, I have successfully trained the autoregression head, but I encountered the following error during inference, at https://github.com/SafeAILab/EAGLE/blob/main/eagle/modeling_eagle.py#L957: [error screenshot] Here are the tensor sizes: [screenshot]
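A quick way to check whether the mismatch comes from disagreeing dimensions between the two models (assuming both load as standard Hugging Face configs; paths are placeholders):

```python
# Sanity check: compare the dimensions the draft model was trained with
# against the target model's config before running inference.
from transformers import AutoConfig

base_cfg = AutoConfig.from_pretrained("/path/to/target-model")
draft_cfg = AutoConfig.from_pretrained("/path/to/eagle-head")

for key in ("hidden_size", "vocab_size", "num_attention_heads", "num_hidden_layers"):
    print(key, getattr(base_cfg, key, None), getattr(draft_cfg, key, None))
```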