Open bitterspeed opened 6 months ago
cc: @fxmarty @echarlaix @JingyaHuang
@ucalyptus2 @fxmarty @echarlaix @JingyaHuang any update on this?
The exact same issue occurs with utter-project/EuroLLM-1.7B:
python -m scripts.convert --quantize --model_id utter-project/EuroLLM-1.7B --task text-generation-with-past
RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 16 but got size 8 for tensor number 1 in the list.
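One possible reading of this message, sketched below in plain Python without PyTorch: a concatenation along dimension 2 (the sequence axis of a key/value cache) requires every other axis to match, and here the heads axis differs (16 versus 8). The helper `check_cat_shapes` and the example shapes are hypothetical, chosen only to match the numbers in the error. They are not taken from the exporter's code.

```python
def check_cat_shapes(shapes, dim):
    """Mimic the shape check performed when concatenating along `dim`.

    Returns the resulting shape, or raises ValueError on the first mismatch.
    """
    first = shapes[0]
    for idx, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(first):
            raise ValueError(f"Rank mismatch for tensor number {idx}.")
        for axis, (expected, got) in enumerate(zip(first, shape)):
            # Every axis except the concatenation axis must agree.
            if axis != dim and expected != got:
                raise ValueError(
                    f"Sizes of tensors must match except in dimension {dim}. "
                    f"Expected size {expected} but got size {got} "
                    f"for tensor number {idx} (axis {axis})."
                )
    out = list(first)
    out[dim] = sum(shape[dim] for shape in shapes)
    return tuple(out)


# Assumed layout: (batch, heads, sequence, head_dim).
past_keys = (1, 16, 4, 128)  # dummy cache built with 16 heads (assumption)
new_keys = (1, 8, 1, 128)    # keys computed with 8 key/value heads (assumption)

try:
    check_cat_shapes([past_keys, new_keys], dim=2)
    message = ""
except ValueError as err:
    message = str(err)
    print(message)
```

If the dummy cache were built with 8 heads instead, the same check would pass and the sequence axis would simply grow from 4 to 5.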
System Info
Who can help?
@michaelbenayoun
Hi all, I'm attempting to convert Llama-3 to ONNX format using transformers.js.
When I run this script,
python convert.py --quantize --model_id meta-llama/Meta-Llama-3-8B-Instruct
I get an error (issue here). Any ideas? Xenova says: "Looks like an issue with dummy input values due to the adoption of grouped query attention."
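To illustrate what "dummy input values" could mean under grouped query attention, here is a minimal sketch of how a key/value cache shape might be derived from a model config. The function `dummy_past_kv_shape` is hypothetical and is not Optimum's or transformers.js's API. The config values are those I understand the published Meta-Llama-3-8B-Instruct config to contain (hidden size 4096, 32 attention heads, 8 key/value heads), so treat them as an assumption.

```python
def dummy_past_kv_shape(config, batch_size=1, past_length=4):
    """Return an assumed (batch, kv_heads, past_length, head_dim) cache shape."""
    hidden_size = config["hidden_size"]
    num_heads = config["num_attention_heads"]
    # With grouped query attention, keys and values use fewer heads than queries.
    # Fall back to the query head count for models without this field.
    num_kv_heads = config.get("num_key_value_heads", num_heads)
    head_dim = hidden_size // num_heads
    return (batch_size, num_kv_heads, past_length, head_dim)


# Assumed config values for Meta-Llama-3-8B-Instruct.
llama3_8b = {
    "hidden_size": 4096,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
}

gqa_shape = dummy_past_kv_shape(llama3_8b)

# Ignoring the key/value head count yields a different heads axis.
naive_config = {k: v for k, v in llama3_8b.items() if k != "num_key_value_heads"}
naive_shape = dummy_past_kv_shape(naive_config)

print(gqa_shape)    # (1, 8, 4, 128)
print(naive_shape)  # (1, 32, 4, 128)
```

A dummy cache generated from the attention head count alone would therefore disagree with the keys and values the model actually produces, which fits the kind of size mismatch reported in this thread.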
Reproduction (minimal, reproducible, runnable)
cd scripts
pip install -r requirements.txt
export HF_TOKEN='....'
python convert.py --quantize --model_id meta-llama/Meta-Llama-3-8B-Instruct
Expected behavior
ONNX conversion to complete.