openvinotoolkit / openvino_notebooks

📚 Jupyter notebook tutorials for OpenVINO™

LLM-chatbot llama3 model issue #2133

Closed ktjylsj closed 4 weeks ago

ktjylsj commented 4 weeks ago

Describe the bug

There are some issues converting the llama3 model to int4 quantization and running the int8 quantized model.


Screenshots

Here's the error log. Export command:

optimum-cli export openvino --model meta-llama/Meta-Llama-3-8B-Instruct --task text-generation-with-past --weight-format int4 --group-size 128 --ratio 0.8 --sym --awq --dataset wikitext2 --num-samples 128 llama-3-8b-instruct/INT4_compressed_weights

Framework not specified. Using pt to export the model.
Loading checkpoint shards: 100%|██████████████████| 4/4 [00:01<00:00, 3.36it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using framework PyTorch: 2.3.1+cpu
Overriding 1 configuration item(s)

Here's the error log for running the int8 model.

Export command:

optimum-cli export openvino --model meta-llama/Meta-Llama-3-8B-Instruct --task text-generation-with-past --weight-format int4 --group-size 128 --ratio 0.8 --sym --awq --dataset wikitext2 --num-samples 128 llama-3-8b-instruct/INT4_compressed_weights

Framework not specified. Using pt to export the model.
Loading checkpoint shards: 100%|██████████████████| 4/4 [00:01<00:00, 3.36it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using framework PyTorch: 2.3.1+cpu
Overriding 1 configuration item(s)

Environment information

Please run python check_install.py in the openvino_notebooks directory. If the output is NOT OK for any of the checks, please follow the instructions to fix that. If that does not work, or if you still encounter the issue, please paste the output of check_install.py here.


ktjylsj commented 4 weeks ago

It was an ov/ov-dev version issue.
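For anyone hitting something similar: the thread does not say which versions were involved, so the following is only a minimal sketch for checking what is installed. It assumes "ov/ov-dev" refers to the `openvino` and `openvino-dev` PyPI packages, and that a mismatch shows up as different `major.minor` releases of the two; both are assumptions, not something confirmed above. The helper names (`installed_versions`, `major_minor`, `releases_differ`) are hypothetical, and `optimum-intel` and `nncf` are listed only because the export command relies on them.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed package names; adjust to whatever is actually in your environment.
PACKAGES = ["openvino", "openvino-dev", "optimum-intel", "nncf"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def major_minor(ver):
    """Return the leading 'major.minor' part of a version string."""
    return ".".join(ver.split(".")[:2])


def releases_differ(versions):
    """True if openvino and openvino-dev are both installed with different major.minor."""
    ov = versions.get("openvino")
    ov_dev = versions.get("openvino-dev")
    if ov is None or ov_dev is None:
        return False
    return major_minor(ov) != major_minor(ov_dev)


versions = installed_versions(PACKAGES)
for name, ver in versions.items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
if releases_differ(versions):
    print("openvino and openvino-dev come from different releases")
```

If the two packages report different releases, reinstalling them at matching versions is a reasonable first thing to try before re-running the export.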