p-wysocki opened 9 months ago
@Wovchena
1) beam_search_causal_lm: Working fine
2) benchmark_genai: Working fine
3) chat_sample: Working fine
4) continuous_batching_accuracy: Raising an error
5) continuous_batching_benchmark: Raising an error
6) greedy_causal_lm: Working fine
7) lora_greedy_causal_lm: Raising an error
8) multinomial_causal_lm: Working fine
9) prompt_lookup_decoding_lm: Working fine
10) speculative_decoding_lm: Working fine

As you can see, everything works fine with chatglm3-6b except continuous_batching_accuracy, continuous_batching_benchmark, and lora_greedy_causal_lm, which raise errors. How can I fix the errors for 4, 5, and 7?
Can you share the build commands?
I see on your screenshot that the error comes from openvino_genai..., which is usually the name for a prebuilt GenAI. It also states that the used version is 24.4. While that's the latest released version, it's outdated from the development point of view. I encourage you to compile the whole project on your own following https://github.com/openvinotoolkit/openvino.genai/blob/master/src/docs/BUILD.md
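For reference, a quick way to check which OpenVINO and GenAI builds your environment actually resolves is the sketch below (an editorial suggestion, not from the original exchange; it assumes the Python bindings are installed):

```sh
# Print the OpenVINO runtime version visible to Python.
python3 -c "import openvino; print(openvino.get_version())"
# Print the GenAI package version, if the GenAI Python bindings are installed.
python3 -c "import openvino_genai; print(openvino_genai.__version__)"
```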
@Wovchena I built OpenVINO with OpenVINO GenAI using the following commands, run sequentially:
```sh
git clone --recursive https://github.com/openvinotoolkit/openvino.git
git clone --recursive https://github.com/openvinotoolkit/openvino.genai.git
cd openvino
sudo ./install_build_dependencies.sh
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --parallel 14
cd --
cmake --install openvino/build --prefix openvino_install
source openvino_install/setupvars.sh
cd openvino.genai
cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
cmake --build ./build/ --config Release --parallel 14
cmake --install ./build/ --config Release --prefix openvino_install
cd openvino_install/samples/cpp
./build_samples.sh
cd --
cd openvino_cpp_samples_build/intel64/Release/
```
1) beam_search_causal_lm: `./beam_search_causal_lm /home/roy/chatglm3-6b-with-past "Why sun is yellow?"`
2) benchmark_genai: `./benchmark_genai -m /home/roy/chatglm3-6b-with-past` (working fine)
3) chat_sample: `./chat_sample /home/roy/chatglm3-6b-with-past` (working fine)
4) greedy_causal_lm: `./greedy_causal_lm /home/roy/chatglm3-6b-with-past "Why sun is yellow?"` (working fine)
5) lora_greedy_causal_lm: `./lora_greedy_causal_lm /home/roy/chatglm3-6b-with-past /home/roy/.cache/huggingface/hub/models--THUDM--chatglm3-6b/snapshots/91a0561caa089280e94bf26a9fc3530482f0fe60/model-00001-of-00007.safetensors "Why sun is yellow?"` (working fine)
6) multinomial_causal_lm: `./multinomial_causal_lm /home/roy/chatglm3-6b-with-past "Why sun is yellow?"` (working fine)
7) prompt_lookup_decoding_lm: `./prompt_lookup_decoding_lm /home/roy/chatglm3-6b-with-past "return 0;"` (working fine)
8) speculative_decoding_lm: `./speculative_decoding_lm /home/roy/chatglm3-6b-with-past /home/roy/Llama-2-7b-chat-hf "Why sun is yellow?"`
After completing the build and install (i.e. openvino_install), I noticed that openvino_install/samples/cpp is missing the speculative_decoding_lm, prompt_lookup_decoding_lm, and lora_greedy_causal_lm folders. So I manually added these three folders to openvino_install/samples/cpp and executed ./build_samples.sh, which generated openvino_cpp_samples_build containing executables for all the sample folders present in openvino_install/samples/cpp. Is that fine, or am I expected to use another approach, or did I miss anything?
1) is likely explained by swapped dimensions for that model. When adding the model to the supported list, please mark it (no beam search).
8) Ensure /home/roy/Llama-2-7b-chat-hf/openvino_model.xml exists after the optimum-cli export of Llama-2-7b-chat-hf.
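A quick way to verify the export produced the expected files is the sketch below (an editorial suggestion; the file list assumes a standard optimum-cli OpenVINO export, and the tokenizer IRs appear only when openvino-tokenizers is installed):

```sh
# Check that the exported IR and tokenizer files are present.
for f in openvino_model.xml openvino_model.bin openvino_tokenizer.xml openvino_detokenizer.xml; do
    [ -f "/home/roy/Llama-2-7b-chat-hf/$f" ] && echo "OK: $f" || echo "MISSING: $f"
done
```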
@Wovchena
1) As you suggested, I checked my /home/roy/Llama-2-7b-chat-hf/openvino_model.xml and fixed that part, but even after the fix I am getting the following error.
2) I also manually added /openvino.genai/tools/continuous_batching to openvino_install/samples/cpp and compiled it after adding a few extra lines to its CMake:

```cmake
find_package(OpenVINOGenAI REQUIRED
    HINTS
        "${CMAKE_BINARY_DIR}"  # Reuse the package from the build.
        ${OpenVINO_DIR}        # GenAI may be installed alongside OpenVINO.
    NO_CMAKE_FIND_ROOT_PATH
)
```

for both directories (i.e. accuracy and benchmark), but in the end I get the following errors.
Note: I used `optimum-cli export openvino --trust-remote-code --model THUDM/chatglm3-6b chatglm3-6b-with-past --task text-generation-with-past` to download and export chatglm3-6b.
You can use chatglm as both the draft and the main model for speculative_decoding_lm. That excludes Llama-2-7b-chat-hf from the list of problems.
Missing .xml files is strange. Every sample requires them to exist, and some of the samples already passed for you. Double check the folder content.
An undeclared beam_idx is also strange, because every sample relies on it.
I forgot to mention that I've updated the main issue with a description of how to install the samples. But since you've already figured that out, no action is required, although your solution is different.
@Wovchena
I have re-executed `optimum-cli export openvino --trust-remote-code --model THUDM/chatglm3-6b chatglm3-6b-with-past --task text-generation-with-past`, which takes the model already downloaded in my default cache and compresses it (below is the image of the whole process).
The image below shows the contents of the chatglm3-6b-with-past directory.
And the image below shows the error I am still getting in speculative_decoding_lm, continuous_batching_benchmark, continuous_batching_accuracy, and continuous_batching_speculative_decoding.
As you can see, the error is related to beam_idx. Can you guide me on what may have gone wrong or where I need to check?
I'm unable to reproduce the speculative_decoding_lm, continuous_batching_benchmark, and continuous_batching_accuracy issues. We can still try to investigate them with you as a background task. Meanwhile you can proceed assuming that they work.
You can check the openvino_model.xml content. There should be a layer named beam_idx. Example:
```xml
<layer id="0" name="beam_idx" type="Parameter" version="opset1">
    <data shape="?" element_type="i32" />
    <output>
        <port id="0" precision="I32" names="beam_idx">
            <dim>-1</dim>
        </port>
    </output>
</layer>
```
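A quick way to check for that layer without opening the file (a minimal sketch; the model path is an example):

```sh
# Count occurrences of a node named beam_idx in the IR; 0 means the input is missing.
grep -c 'name="beam_idx"' chatglm3-6b-with-past/openvino_model.xml
```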
continuous_batching_speculative_decoding requires -m and -a as named args, not just paths. @iefode, is it possible to add validation for the cmd args and make them required?
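For illustration, an invocation with the named args would look like the sketch below (based only on the -m and -a flags mentioned above; the model directories are placeholders):

```sh
# -m: main model directory, -a: assisting (draft) model directory.
./continuous_batching_speculative_decoding -m ./chatglm3-6b-with-past -a ./chatglm3-6b-with-past
```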
@Wovchena
Did you mean I should create a pull request while assuming speculative_decoding_lm, continuous_batching_benchmark, and continuous_batching_accuracy are working? Also, I checked openvino_model.xml earlier and was able to locate the suggested portion.
> Did you mean I should create a pull request while assuming speculative_decoding_lm, continuous_batching_benchmark, and continuous_batching_accuracy are working?
Yes.
@ilya-lavrenov, maybe you can suggest something about the failing speculative_decoding_lm, continuous_batching_benchmark, and continuous_batching_accuracy?
@Aniruddha521 what OpenVINO version do you use for inference?
It looks like the PA transformation has not worked correctly for ChatGLM3-6B.
> I'm unable to reproduce the speculative_decoding_lm, continuous_batching_benchmark, and continuous_batching_accuracy issues. […] You can check the openvino_model.xml content. There should be a layer named beam_idx. […] continuous_batching_speculative_decoding requires -m and -a as named args, not just paths. @iefode, is it possible to add validation for the cmd args and make them required?
I agree with you, @Wovchena, about making the speculative decoding args required. But I can say that the original problem reproduces in all CB samples. I totally agree with @ilya-lavrenov that the PA transformation looks to have not worked correctly for ChatGLM3-6B.
> And the image below shows the error I am still getting in speculative_decoding_lm, continuous_batching_benchmark, continuous_batching_accuracy, and continuous_batching_speculative_decoding. As you can see, the error is related to beam_idx. Can you guide me on what may have gone wrong or where I need to check?
@Aniruddha521, please make sure you're using the latest version of OpenVINO. I've just successfully run the model and inferred it.
> @Wovchena I built OpenVINO with OpenVINO GenAI using the following commands, run sequentially: […] Is that fine, or am I expected to use another approach, or did I miss anything?
I re-cloned openvino and openvino.genai and proceeded as mentioned in the steps above; the OpenVINO version in my conda environment is 2024.4.0-16579-c3152d32c9c-releases/2024/4.
Could you please share the scripts or code snippets responsible for implementing the PA transformation and beam indexing? I'd like to explore them to deepen my understanding.
@ilya-lavrenov @Wovchena @iefode @CuriousPanCake
> @Wovchena I built OpenVINO with OpenVINO GenAI using the following commands, run sequentially: […] I re-cloned openvino and openvino.genai and proceeded as mentioned in the steps above; the OpenVINO version in my conda environment is 2024.4.0-16579-c3152d32c9c-releases/2024/4. Could you please share the scripts or code snippets responsible for implementing the PA transformation and beam indexing? […]
I think the fix for your issue may not be in 2024.4.0, but it is present on the current master.
@CuriousPanCake I executed the below-mentioned commands sequentially to build OpenVINO with OpenVINO GenAI:
```sh
git clone --recursive https://github.com/openvinotoolkit/openvino.git
git clone --recursive https://github.com/openvinotoolkit/openvino.genai.git
cd openvino
sudo ./install_build_dependencies.sh
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --parallel 14
cd --
cmake --install openvino/build --prefix openvino_install
source openvino_install/setupvars.sh
cd openvino.genai
cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
cmake --build ./build/ --config Release --parallel 14
cd ..
cmake --install openvino.genai/build/ --config Release --prefix openvino_install
cd openvino_install/samples/cpp
./build_samples.sh
cd --
```
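One way to confirm that the samples load the freshly built libraries rather than the conda environment's 2024.4 packages (a minimal sketch; the binary path assumes the sample build layout above):

```sh
# Show which OpenVINO shared libraries the sample binary resolves to at runtime.
ldd openvino_cpp_samples_build/intel64/Release/speculative_decoding_lm | grep -i openvino
```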
If there is anything I have missed, please let me know.
You mentioned that this issue can be resolved by using the current master; can you provide more clarity on this? I also tried `export PYTHONPATH=Path_to_cloned_directory`, but the result remains the same.
Also, can you share the build commands you used?
Can anyone help me with this matter? I am getting this error while running the tests from https://github.com/openvinotoolkit/openvino.genai/tree/master/tests/python_tests#customise-tests-run.
Also, the version of openvino_genai is 2024.5.0.0 in the build prefix (openvino_install), whereas in my conda environment it is 2024.4.0.0, and when using `pip install openvino-genai==2024.5.0.0` it shows:

```
ERROR: Could not find a version that satisfies the requirement openvino-genai==2024.5.0.0 (from versions: 2024.2.0.0, 2024.3.0.0, 2024.4.0.0, 2024.4.1.0.dev20240926)
ERROR: No matching distribution found for openvino-genai==2024.5.0.0
```

which I think is because 2024.5.0.0 is not released yet.
@ilya-lavrenov @Wovchena @iefode @CuriousPanCake
> and when using pip install openvino-genai==2024.5.0.0 it is showing

OpenVINO 2024.5.0 is not released yet. It's available as a pre-release package and should be installed with the extra options `--pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly`.
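Put together, the install command would look like this (a sketch of the options mentioned above):

```sh
# Install the pre-release GenAI wheel from the nightly index.
python3 -m pip install --pre openvino-genai --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
```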
> You mentioned that this issue can be resolved by using the current master, can you provide more clarity regarding this?

I was able to run speculative_decoding_lm from Docker: `sudo docker run -it ubuntu:20.04 /bin/bash`. You can try the same to verify it works for you. If it passes, you need to find which part of your steps diverged.
```sh
cd ~
apt update
apt install git python3.9 -y
apt install python3.9-dev -y
git clone --recursive https://github.com/openvinotoolkit/openvino.git
git clone --recursive https://github.com/openvinotoolkit/openvino.genai.git
cd openvino
./install_build_dependencies.sh
mkdir build && cd build
cmake -DENABLE_PYTHON=ON -DPython3_EXECUTABLE=/usr/bin/python3.9 -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --parallel 14
cd --
cmake --install openvino/build --prefix openvino_install
source openvino_install/setupvars.sh
cd openvino.genai
cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
cmake --build ./build/ --config Release --parallel 14
cd ..
cmake --install openvino.genai/build/ --config Release --prefix openvino_install
cd openvino_install/samples/cpp
./build_samples.sh
cd --
python3.9 -m pip install -r ~/openvino.genai/samples/requirements.txt
export PYTHONPATH=/root/openvino_install/python/
python3.9 -m pip install openvino.genai/thirdparty/openvino_tokenizers/ --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
optimum-cli export openvino --trust-remote-code --task text-generation-with-past --model THUDM/chatglm3-6b chatglm3-6b
./openvino_cpp_samples_build/intel64/Release/speculative_decoding_lm chatglm3-6b/ chatglm3-6b/ "Why is the Sun yellow?"
```
> I was able to run speculative_decoding_lm from Docker: `sudo docker run -it ubuntu:20.04 /bin/bash`. You can try the same to verify it works for you. If it passes, you need to find which part of your steps diverged. […]
@Wovchena
I proceeded with almost the same sequence of commands, but I have Ubuntu 24 and Python 3.11, and the extra line `python3.9 -m pip install openvino.genai/thirdparty/openvino_tokenizers/ --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly` is what I was missing. But I also noticed a problem: if I skip the `export PYTHONPATH=/root/openvino_install/python/` line, I get the following error:
while if it is not skipped, then:
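To see which openvino_genai module Python actually resolves with and without the PYTHONPATH override (a minimal sketch; the prefix path assumes the install steps above):

```sh
# Print where the openvino_genai Python module is imported from.
python3 -c "import openvino_genai; print(openvino_genai.__file__)"
```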
Were you able to reproduce it in docker?
> Were you able to reproduce it in docker?

Probably yes, since after executing `./openvino_cpp_samples_build/intel64/Release/speculative_decoding_lm chatglm3-6b/ chatglm3-6b/ "Why is the Sun yellow?"` my laptop used to lag and remain non-responsive for a few minutes, after which it printed `Killed`, maybe due to the high computational resource requirements?
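If the process was killed by the kernel's OOM killer, which is plausible when a draft and a main 6B model run on one laptop, the kernel log should show it (a suggested check, not from the thread):

```sh
# Look for OOM-killer activity in the kernel log.
sudo dmesg | grep -iE 'out of memory|oom-killer|killed process'
```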
@Wovchena
I proceeded as mentioned in the task https://github.com/openvinotoolkit/openvino.genai/issues/259 with the following changes:
1) Extended nightly_models in the file openvino.genai/tests/python_tests/ov_genai_test_utils.py:
```python
nightly_models = [
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "facebook/opt-125m",
    "microsoft/phi-1_5",
    "microsoft/phi-2",
    "THUDM/chatglm2-6b",
    "THUDM/chatglm3-6b",  # no beam_search
    "Qwen/Qwen2-0.5B-Instruct",
    "Qwen/Qwen-7B-Chat",
    "Qwen/Qwen1.5-7B-Chat",
    "argilla/notus-7b-v1",
    "HuggingFaceH4/zephyr-7b-beta",
    "ikala/redpajama-3b-chat",
    "mistralai/Mistral-7B-v0.1",
]
```
2) Added the model to https://github.com/openvinotoolkit/openvino.genai/blob/84702501b688590457e268f4b7f9c2b0bc012c1b/.github/workflows/causal_lm_cpp.yml#L62 as cpp-greedy_causal_lm-Chatglm3-6b and cpp-prompt_lookup_decoding_lm-ubuntu-Chatglm3-6b.
If I missed anything or any modification is needed, please let me know; I will be glad to modify it. I appreciate any help.
You also need to extend the supported models list. Add a note that beam_search_causal_lm isn't supported. Where can I find a pull request?
@Wovchena You can find the pull request here.
Context
This task regards enabling tests for chatglm3-6b. You can find more details under openvino_notebooks LLM chatbot README.md.
Please ask general questions in the main issue at https://github.com/openvinotoolkit/openvino.genai/issues/259
What needs to be done?
Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259
Example Pull Requests
Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259
Contact points
Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259