aoke79 opened this issue 1 month ago
pip-list.txt — attaching the pip list FYI, thanks.
Can anyone please take a look at this issue? Thanks.
The --task value is incorrect for optimum-cli. Try text-generation-with-past, or don't specify it at all.
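For example (a sketch, reusing the model and output paths from your own commands):

optimum-cli export openvino -m Meta--Llama-2-7b-chat-hf --task text-generation-with-past --weight-format int4 ov--Llama-2-7b-chat-hf-int4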
If I remove --task text-generation entirely, it fails as below:
optimum-cli export openvino -m Meta--Llama-2-7b-chat-hf --weight-format int4 ov--Llama-2-7b-chat-hf-int4
Traceback (most recent call last):
File "
It worked with --task text-generation-with-past, as below:
INFO:nncf:Statistics of the bitwidth distribution:
+----------------+-----------------------------+----------------------------------------+
| Num bits (N)   | % all parameters (layers)   | % ratio-defining parameters (layers)   |
+================+=============================+========================================+
| 8              | 4% (2 / 226)                | 0% (0 / 224)                           |
+----------------+-----------------------------+----------------------------------------+
| 4              | 96% (224 / 226)             | 100% (224 / 224)                       |
+----------------+-----------------------------+----------------------------------------+
Applying Weight Compression ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 226/226 • 0:03:17 • 0:00:00
Set tokenizer padding side to left for text-generation-with-past task.
BTW: how can I know which --task parameter to use for which model? Thanks a lot.
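One way to check (a sketch, and an assumption on my part that your optimum version ships this helper) is to ask optimum's TasksManager, which optimum-cli uses internally, to infer the task for a model:

# Assumed helper: TasksManager.infer_task_from_model from optimum's exporter utilities.
from optimum.exporters.tasks import TasksManager

task = TasksManager.infer_task_from_model("meta-llama/Llama-2-7b-chat-hf")
print(task)  # e.g. "text-generation"; add the "-with-past" suffix for a KV-cache export

Adding -with-past to the inferred task is what enables the stateful KV-cache export the GenAI pipelines expect.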
I used the newly generated model, but "benchmark_genai" still does not work with it.
python benchmark_genai.py -m C:\AIGC\hf\llama2_7b_chat_ov_int4_default_24_3 -p "why the Sun is yellow?" -nw 1 -n 1 -mt 200 -d NPU
Traceback (most recent call last):
  File "C:\AIGC\openvino\openvino.genai\samples\python\benchmark_genai\benchmark_genai.py", line 49, in <module>
Thanks,
Hi @aoke79, the problem should be fixed already; please update the packages:
pip uninstall openvino openvino-tokenizers openvino-genai
pip install --pre openvino openvino-tokenizers openvino-genai --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
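After reinstalling, you can confirm which build is active (get_version is part of the standard openvino Python API):

python -c "import openvino; print(openvino.get_version())"

The printed version string should correspond to the nightly wheel you just installed.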
Hi all, I failed to run Llama-2-7b-chat-hf on NPU; please give me a hand.
(env_ov_genai) c:\AIGC\openvino\openvino.genai\samples\python\beam_search_causal_lm>python beam_search_causal_lm.py c:\AIGC\hf\ov--Llama-2-7b-chat-hf-int4-sym-g128 "why the Sun is yellow?"
Traceback (most recent call last):
  File "c:\AIGC\openvino\openvino.genai\samples\python\beam_search_causal_lm\beam_search_causal_lm.py", line 29, in <module>
    main()
  File "c:\AIGC\openvino\openvino.genai\samples\python\beam_search_causal_lm\beam_search_causal_lm.py", line 24, in main
    beams = pipe.generate(args.prompts, config)
RuntimeError: Exception from src\inference\src\cpp\infer_request.cpp:79:
Check '::getPort(port, name, {_impl->get_inputs(), _impl->get_outputs()})' failed at src\inference\src\cpp\infer_request.cpp:79:
Port for tensor name beam_idx was not found.
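A quick way to tell whether an exported IR is the stateful variant the pipelines expect is to list its inputs (a sketch, assuming the default openvino_model.xml file name that optimum-cli produces):

# List the model's inputs; a stateful export includes a "beam_idx" input.
import openvino as ov

model = ov.Core().read_model(r"c:\AIGC\hf\ov--Llama-2-7b-chat-hf-int4-sym-g128\openvino_model.xml")
print([inp.get_any_name() for inp in model.inputs])

If beam_idx is missing from that list, re-exporting with --task text-generation-with-past, as suggested above, should produce it.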
(env_ov_genai) c:\AIGC\openvino\openvino.genai\samples\python\benchmark_genai>python benchmark_genai.py -m c:\AIGC\openvino\models\TinyLlama-1.1B-Chat-v1.0\OV_FP16-4BIT_DEFAULT -p "why the Sun is yellow?" -nw 1 -n 1 -mt 200 -d NPU
Traceback (most recent call last):
  File "c:\AIGC\openvino\openvino.genai\samples\python\benchmark_genai\benchmark_genai.py", line 49, in <module>
    main()
  File "c:\AIGC\openvino\openvino.genai\samples\python\benchmark_genai\benchmark_genai.py", line 32, in main
    pipe.generate(prompt, config)
RuntimeError: Exception from C:\Jenkins\workspace\private-ci\ie\build-windows-vs2019\b\repos\openvino.genai\src\cpp\src\llm_pipeline_static.cpp:206:
Currently only batch size=1 is supported
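The samples are thin wrappers, so the NPU flow can also be tried directly with a single prompt (a sketch, assuming the nightly openvino-genai from the reply above and the model path from this post):

# Minimal single-prompt generation on NPU; the static pipeline supports batch size 1 only.
import openvino_genai

pipe = openvino_genai.LLMPipeline(r"c:\AIGC\hf\ov--Llama-2-7b-chat-hf-int4-sym-g128", "NPU")
config = openvino_genai.GenerationConfig()
config.max_new_tokens = 200
print(pipe.generate("why the Sun is yellow?", config))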
(env_ov_genai) c:\AIGC\openvino\openvino.genai\samples\python>python chat_sample.py c:\AIGC\hf\ov--Llama-2-7b-chat-hf-int4-sym-g128
Traceback (most recent call last):
  File "c:\AIGC\openvino\openvino.genai\samples\python\chat_sample.py", line 43, in <module>
    main()
  File "c:\AIGC\openvino\openvino.genai\samples\python\chat_sample.py", line 22, in main
    pipe = openvino_genai.LLMPipeline(args.model_dir, device)
RuntimeError: Exception from src\core\src\pass\stateful_to_stateless.cpp:128:
Stateful models without beam_idx input are not supported in StatefulToStateless transformation
I'm not sure whether I converted the model correctly, so I generated two models with the command lines above, but neither of them worked. Could you please show me how to do that? Thanks a lot.