Open dante3112 opened 1 month ago
Were you able to solve it? I'm getting the same issue with a base model.
@milsun Nope, I tried the unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit
model as well, but I'm still getting the same error.
I am getting a similar error too (using unsloth/Meta-Llama-3.1-8B-Instruct).
@danielhanchen can you please help us out with this?
I had the same issue on the 14B model, and the only workaround I have found is to add "embed_tokens" and "lm_head" to the target modules, like
so: target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj", "embed_tokens", "lm_head"]
Of course this will require more time and more VRAM, but at least you will be able to run the trainer.
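For reference, a minimal sketch of how that looks in the PEFT setup (the model name, rank, and other hyperparameters here are placeholders, not values from this thread):

```python
from unsloth import FastLanguageModel

# Placeholder model and settings; adjust to your own setup.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen2.5-Coder-7B-Instruct",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Including "embed_tokens" and "lm_head" makes the embeddings and LM head trainable,
# which sidesteps the "Untrained tokens ... causing NaNs" error at the cost of extra VRAM.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      "embed_tokens", "lm_head"],
)
```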
Extreme apologies on the delay everyone - sorry!
@dante3112 @WasamiKirua I managed to fix the Instruct model issue - please update Unsloth via
pip uninstall unsloth -y
pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
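If you want to double-check which versions actually got installed after the upgrade, something like this should work (generic Python, not Unsloth-specific):

```python
from importlib.metadata import version

# Print the installed package versions to confirm the upgrade took effect.
print("unsloth:", version("unsloth"))
print("unsloth_zoo:", version("unsloth_zoo"))  # only if unsloth_zoo is installed
```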
@milsun I think the base model should be fine now.
@paraschopra Most likely some sort of chat template issue - i.e. some random tokens in the chat template are untrained.
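A rough way to sanity-check the "untrained tokens in the chat template" theory might look like this (a hypothetical check, not Unsloth's internal logic; it assumes a model and tokenizer are already loaded):

```python
import torch

# Look for special tokens whose embedding rows look untrained (near-zero norm).
embed = model.get_input_embeddings().weight
for tok in tokenizer.all_special_tokens:
    tok_id = tokenizer.convert_tokens_to_ids(tok)
    norm = embed[tok_id].norm().item()
    if norm < 1e-6:
        print(f"Possibly untrained token: {tok!r} (id {tok_id}), embedding norm {norm:.2e}")
```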
@danielhanchen This didn't fix the issue. I'm trying to train Mistral-12B-NeMo-Instruct.
It worked when I trained my first LoRA, but is now failing on my second dataset.
@danielhanchen still getting the same error for "unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit",
but training has started for "unsloth/Qwen2.5-Coder-7B-Instruct".
:( Ok, will re-investigate - sorry about the issue.
Still running into this issue.
unsloth 2024.10.7, unsloth_zoo 2024.11.0
Yes, it seems to be a chat template issue. I managed to get training working by removing tool calls.
This is the one I use:
"chat_template": "{%- if messages[0]['role'] == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n{%- else %}\n {{- '<|im_start|>system\\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\\n' }}\n{%- endif %}\n{%- for message in messages %}\n {%- if message.role == \"user\" or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role }}\n {%- if message.content %}\n {{- '\\n' + message.content }}\n {%- endif %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if loop.index0 == 0 or messages[loop.index0 - 1].role != \"tool\" %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' + message.content + '\\n</tool_response>' }}\n {%- if loop.last or messages[loop.index0 + 1].role != \"tool\" %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
Edit: When I say "working", I mean "running". No verification of results, yet. I'll probably do a test run later today.
Edit2: Results and settings can be found here: https://huggingface.co/neph1/Qwen2.5-Coder-7B-Instruct-Unity
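In case it helps anyone else, applying a custom template like the one above before training can be done by overwriting the tokenizer's template (a small sketch; `custom_template` stands in for the full string quoted above):

```python
# Overwrite the tokenizer's chat template with the tool-call-free version quoted above.
# `custom_template` is a placeholder for the full template string; it is truncated here.
custom_template = "{%- if messages[0]['role'] == 'system' %} ... {%- endif %}"
tokenizer.chat_template = custom_template

# Quick check that the template renders as expected.
example = [{"role": "user", "content": "Hello!"}]
print(tokenizer.apply_chat_template(example, tokenize=False, add_generation_prompt=True))
```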
I am trying to finetune Qwen-2.5 Coder-7B-Instruct on my custom dataset but am getting the following error:
ValueError: Unsloth: Untrained tokens of [[]] found, but embed_tokens & lm_head not trainable, causing NaNs. Restart then add `embed_tokens` & `lm_head` to `FastLanguageModel.get_peft_model(target_modules = [..., "embed_tokens", "lm_head",]). `Are you using the `base` model? Instead, use the `instruct` version to silence this warning.
I am getting this error with the Qwen-2.5 Coder-7B (base) & Qwen-2.5 Coder-7B-Instruct models, while Mistral-Nemo-Instruct-2407-bnb-4bit works fine. I have updated the unsloth library as well.
Any workarounds for this, and why is it occurring?
my parameters: