geronimi73 / 3090_shorts

minimal LLM scripts for 24GB VRAM GPUs. training, inference, whatever

Finetuning OpenELM encounters RuntimeError #2

Open elliotthwang opened 2 months ago

elliotthwang commented 2 months ago

```
RuntimeError                              Traceback (most recent call last)
in ()
      4 #).log_code(include_fn=lambda path: path.endswith(".py") or path.endswith(".ipynb"))
      5
----> 6 trainer.train()

14 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   2235     # remove once script supports set_grad_enabled
   2236     _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2237     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   2238
   2239

RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.FloatTensor instead (while checking arguments for embedding)
```

Training for more than 100 steps encounters this RuntimeError. Please kindly give some advice on how to deal with it!
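For context, `torch.embedding` only accepts integer index tensors, so this error usually means the `input_ids` reaching the model's embedding layer are floats instead of `torch.long`. Below is a minimal sketch (not taken from the notebook) that reproduces the dtype problem, plus a hypothetical helper for inspecting the dtypes in a training batch:

```python
import torch
import torch.nn as nn

# A toy embedding layer, standing in for the model's token embedding
emb = nn.Embedding(num_embeddings=32000, embedding_dim=64)

ids_ok = torch.tensor([[1, 2, 3]], dtype=torch.long)
ids_bad = ids_ok.float()          # float indices reproduce the error

print(emb(ids_ok).shape)          # works: torch.Size([1, 3, 64])
try:
    emb(ids_bad)
except RuntimeError as e:
    print(e)                      # "Expected tensor for argument #1 'indices' ..."

# Hypothetical helper: print dtypes of a batch before it reaches the model
def check_batch(batch):
    for name, value in batch.items():
        if torch.is_tensor(value):
            print(name, value.dtype)   # input_ids and labels should be torch.int64
```

If a batch turns out to contain float `input_ids`, casting them with `batch["input_ids"] = batch["input_ids"].long()` before the forward pass is one way to confirm that the dtype is the culprit rather than the model itself.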
geronimi73 commented 2 months ago

A few more details on your setup, please: on what device, and how are you running which script?

elliotthwang commented 2 months ago

I adopted your nb_finetune-full_OpenELM-450M.ipynb script and am training on a Colab T4. I set bf16=False to save memory, set max_seq_length = 512, and used the guanaco-llama2-chinese-1k dataset.

Those are the only adjustments I made to get the finetune running. Please help, thanks!
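For reference, a minimal sketch of what those adjustments could look like, assuming the notebook builds its training configuration with Hugging Face `TrainingArguments` (an assumption; the batch size, accumulation steps, and output directory below are purely illustrative). Since the T4 does not support bfloat16, fp16 is the usual mixed-precision alternative on that GPU:

```python
from transformers import TrainingArguments

# Illustrative values only; the actual notebook may use different names/values.
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    bf16=False,      # T4 has no bfloat16 support
    fp16=True,       # usual mixed-precision choice on a T4 (assumption)
    logging_steps=10,
    num_train_epochs=1,
)

max_seq_length = 512                          # shorter sequences to fit T4 memory
dataset_name = "guanaco-llama2-chinese-1k"    # dataset mentioned above
```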

geronimi73 commented 2 months ago

Could you share your notebook?

elliotthwang commented 1 month ago

Please find herewith the notebook: https://github.com/elliotthwang/3090_shorts/blob/main/Apple_OpenELM_450M_finetune_full%20(1).ipynb