p-wysocki opened 9 months ago
@Wovchena
1) beam_search_causal_lm: Working fine
2) benchmark_genai: Working fine
3) chat_sample: Working fine
4) continuous_batching_accuracy: Raising an error
5) continuous_batching_benchmark: Raising an error
6) greedy_causal_lm: Working fine
7) lora_greedy_causal_lm: Raising an error
8) multinomial_causal_lm: Working fine
9) prompt_lookup_decoding_lm: Working fine
10) speculative_decoding_lm: Working fine

As you can see, everything works fine with chatglm3-6b except continuous_batching_accuracy, continuous_batching_benchmark, and lora_greedy_causal_lm, which raise errors. How can I fix the errors for 4, 5, and 7?
Can you share the build commands?
I see on your screenshot that the error comes from openvino_genai..., which is usually the name for a prebuilt GenAI. It also states that the used version is 24.4. While that's the latest released version, it's outdated from the development point of view. I encourage you to compile the whole project on your own following https://github.com/openvinotoolkit/openvino.genai/blob/master/src/docs/BUILD.md
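For reference, a quick way to check which OpenVINO and GenAI builds your environment actually resolves is the sketch below (an editorial suggestion, not from the original exchange; it assumes the Python bindings are installed):

```sh
# Print the OpenVINO runtime version visible to Python.
python3 -c "import openvino; print(openvino.get_version())"
# Print the GenAI package version, if the GenAI Python bindings are installed.
python3 -c "import openvino_genai; print(openvino_genai.__version__)"
```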
@Wovchena I built OpenVINO with OpenVINO GenAI using the following commands, run sequentially:
```sh
git clone --recursive https://github.com/openvinotoolkit/openvino.git
git clone --recursive https://github.com/openvinotoolkit/openvino.genai.git
cd openvino
sudo ./install_build_dependencies.sh
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --parallel 14
cd --
cmake --install openvino/build --prefix openvino_install
source openvino_install/setupvars.sh
cd openvino.genai
cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
cmake --build ./build/ --config Release --parallel 14
cmake --install ./build/ --config Release --prefix openvino_install
cd openvino_install/samples/cpp
./build_samples.sh
cd --
cd openvino_cpp_samples_build/intel64/Release/
```
1) beam_search_causal_lm: `./beam_search_causal_lm /home/roy/chatglm3-6b-with-past "Why sun is yellow?"`
2) benchmark_genai: `./benchmark_genai -m /home/roy/chatglm3-6b-with-past` (working fine)
3) chat_sample: `./chat_sample /home/roy/chatglm3-6b-with-past` (working fine)
4) greedy_causal_lm: `./greedy_causal_lm /home/roy/chatglm3-6b-with-past "Why sun is yellow?"` (working fine)
5) lora_greedy_causal_lm: `./lora_greedy_causal_lm /home/roy/chatglm3-6b-with-past /home/roy/.cache/huggingface/hub/models--THUDM--chatglm3-6b/snapshots/91a0561caa089280e94bf26a9fc3530482f0fe60/model-00001-of-00007.safetensors "Why sun is yellow?"` (working fine)
6) multinomial_causal_lm: `./multinomial_causal_lm /home/roy/chatglm3-6b-with-past "Why sun is yellow?"` (working fine)
7) prompt_lookup_decoding_lm: `./prompt_lookup_decoding_lm /home/roy/chatglm3-6b-with-past "return 0;"` (working fine)
8) speculative_decoding_lm: `./speculative_decoding_lm /home/roy/chatglm3-6b-with-past /home/roy/Llama-2-7b-chat-hf "Why sun is yellow?"`
After completing the build and install (i.e. openvino_install), I noticed that openvino_install/samples/cpp is missing the speculative_decoding_lm, prompt_lookup_decoding_lm, and lora_greedy_causal_lm folders. So I manually added these three folders to openvino_install/samples/cpp and executed ./build_samples.sh, which generated openvino_cpp_samples_build containing executables for all the sample folders present in openvino_install/samples/cpp. Is that fine, or am I expected to use another approach, or did I miss anything?
1) is likely explained by swapped dimensions for that model. When adding the model to the supported list, please mark it (no beam search).
8) Ensure /home/roy/Llama-2-7b-chat-hf/openvino_model.xml exists after the optimum-cli export of Llama-2-7b-chat-hf.
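A quick way to verify the export produced the expected files is the sketch below (an editorial suggestion; the file list assumes a standard optimum-cli OpenVINO export, and the tokenizer IRs appear only when openvino-tokenizers is installed):

```sh
# Check that the exported IR and tokenizer files are present.
for f in openvino_model.xml openvino_model.bin openvino_tokenizer.xml openvino_detokenizer.xml; do
    [ -f "/home/roy/Llama-2-7b-chat-hf/$f" ] && echo "OK: $f" || echo "MISSING: $f"
done
```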
@Wovchena
1) As you suggested, I checked my /home/roy/Llama-2-7b-chat-hf/openvino_model.xml and fixed that part, but even after the fix I am getting the following error.
2) I also manually added /openvino.genai/tools/continuous_batching to openvino_install/samples/cpp and compiled it after adding a few extra lines to its CMake:

```cmake
find_package(OpenVINOGenAI REQUIRED
    HINTS
        "${CMAKE_BINARY_DIR}"  # Reuse the package from the build.
        ${OpenVINO_DIR}        # GenAI may be installed alongside OpenVINO.
    NO_CMAKE_FIND_ROOT_PATH
)
```

for both directories (i.e. accuracy and benchmark), but in the end I get the following errors.
Note: I used `optimum-cli export openvino --trust-remote-code --model THUDM/chatglm3-6b chatglm3-6b-with-past --task text-generation-with-past` to download and export chatglm3-6b.
You can use chatglm as both the draft and the main model for speculative_decoding_lm. That excludes Llama-2-7b-chat-hf from the list of problems.
Missing .xml files is strange. Every sample requires them to exist, and some of the samples already passed for you. Double check the folder content.
An undeclared beam_idx is also strange, because every sample relies on it.
I forgot to mention that I've updated the main issue with a description of how to install the samples. But since you've already figured that out, no action is required, although your solution is different.
@Wovchena
I have re-executed `optimum-cli export openvino --trust-remote-code --model THUDM/chatglm3-6b chatglm3-6b-with-past --task text-generation-with-past`, which takes the model already downloaded in my default cache and compresses it (below is the image of the whole process).
The image below shows the contents of the chatglm3-6b-with-past directory.
And the image below shows the error I am still getting in speculative_decoding_lm, continuous_batching_benchmark, continuous_batching_accuracy, and continuous_batching_speculative_decoding.
As you can see, the error is related to beam_idx. Can you guide me on what may have gone wrong or where I need to check?
I'm unable to reproduce the speculative_decoding_lm, continuous_batching_benchmark, and continuous_batching_accuracy issues. We can still try to investigate them with you as a background task. Meanwhile you can proceed assuming that they work.
You can check the openvino_model.xml content. There should be a layer named beam_idx. Example:
```xml
<layer id="0" name="beam_idx" type="Parameter" version="opset1">
    <data shape="?" element_type="i32" />
    <output>
        <port id="0" precision="I32" names="beam_idx">
            <dim>-1</dim>
        </port>
    </output>
</layer>
```
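A quick way to check for that layer without opening the file (a minimal sketch; the model path is an example):

```sh
# Count occurrences of a node named beam_idx in the IR; 0 means the input is missing.
grep -c 'name="beam_idx"' chatglm3-6b-with-past/openvino_model.xml
```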
continuous_batching_speculative_decoding requires -m and -a as named args, not just paths. @iefode, is it possible to add validation for the cmd args and make them required?
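For illustration, an invocation with the named args would look like the sketch below (based only on the -m and -a flags mentioned above; the model directories are placeholders):

```sh
# -m: main model directory, -a: assisting (draft) model directory.
./continuous_batching_speculative_decoding -m ./chatglm3-6b-with-past -a ./chatglm3-6b-with-past
```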
@Wovchena
Did you mean I should create a pull request while assuming speculative_decoding_lm, continuous_batching_benchmark, and continuous_batching_accuracy are working? Also, I checked openvino_model.xml earlier and was able to locate the suggested portion.
> Did you mean I should create a pull request while assuming speculative_decoding_lm, continuous_batching_benchmark, and continuous_batching_accuracy are working?
Yes.
@ilya-lavrenov, maybe you can suggest something about the failing speculative_decoding_lm, continuous_batching_benchmark, and continuous_batching_accuracy?
@Aniruddha521 what OpenVINO version do you use for inference?
It looks like the PA transformation has not worked correctly for ChatGLM3-6B.
> I'm unable to reproduce the speculative_decoding_lm, continuous_batching_benchmark, and continuous_batching_accuracy issues. […] You can check the openvino_model.xml content. There should be a layer named beam_idx. […] continuous_batching_speculative_decoding requires -m and -a as named args, not just paths. @iefode, is it possible to add validation for the cmd args and make them required?
I agree with you, @Wovchena, about making the speculative decoding args required. But I can say that the original problem reproduces in all CB samples. I totally agree with @ilya-lavrenov that the PA transformation looks to have not worked correctly for ChatGLM3-6B.
> And the image below shows the error I am still getting in speculative_decoding_lm, continuous_batching_benchmark, continuous_batching_accuracy, and continuous_batching_speculative_decoding. As you can see, the error is related to beam_idx. Can you guide me on what may have gone wrong or where I need to check?
@Aniruddha521, please make sure you're using the latest version of OpenVINO. I've just successfully run the model and inferred it.
> @Wovchena I built OpenVINO with OpenVINO GenAI using the following commands, run sequentially: […] Is that fine, or am I expected to use another approach, or did I miss anything?
I re-cloned openvino and openvino.genai and proceeded as mentioned in the steps above; the OpenVINO version in my conda environment is 2024.4.0-16579-c3152d32c9c-releases/2024/4.
Could you please share the scripts or code snippets responsible for implementing the PA transformation and beam indexing? I'd like to explore them to deepen my understanding.
@ilya-lavrenov @Wovchena @iefode @CuriousPanCake
> @Wovchena I built OpenVINO with OpenVINO GenAI using the following commands, run sequentially: […] I re-cloned openvino and openvino.genai and proceeded as mentioned in the steps above; the OpenVINO version in my conda environment is 2024.4.0-16579-c3152d32c9c-releases/2024/4. Could you please share the scripts or code snippets responsible for implementing the PA transformation and beam indexing? […]
I think the fix for your issue may not be in 2024.4.0, but it is present on the current master.
@CuriousPanCake I executed the below-mentioned commands sequentially to build OpenVINO with OpenVINO GenAI:
```sh
git clone --recursive https://github.com/openvinotoolkit/openvino.git
git clone --recursive https://github.com/openvinotoolkit/openvino.genai.git
cd openvino
sudo ./install_build_dependencies.sh
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --parallel 14
cd --
cmake --install openvino/build --prefix openvino_install
source openvino_install/setupvars.sh
cd openvino.genai
cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
cmake --build ./build/ --config Release --parallel 14
cd ..
cmake --install openvino.genai/build/ --config Release --prefix openvino_install
cd openvino_install/samples/cpp
./build_samples.sh
cd --
```
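One way to confirm that the samples load the freshly built libraries rather than the conda environment's 2024.4 packages (a minimal sketch; the binary path assumes the sample build layout above):

```sh
# Show which OpenVINO shared libraries the sample binary resolves to at runtime.
ldd openvino_cpp_samples_build/intel64/Release/speculative_decoding_lm | grep -i openvino
```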
If there is anything I have missed, please let me know.
You mentioned that this issue can be resolved by using the current master; can you provide more clarity on this? I also tried `export PYTHONPATH=Path_to_cloned_directory`, but the result remains the same.
Also, can you share the build commands you used?
Can anyone help me with this matter? I am getting this error while running the tests from https://github.com/openvinotoolkit/openvino.genai/tree/master/tests/python_tests#customise-tests-run.
Also, the version of openvino_genai is 2024.5.0.0 in the build prefix (openvino_install), whereas in my conda environment it is 2024.4.0.0, and when using `pip install openvino-genai==2024.5.0.0` it shows:

```
ERROR: Could not find a version that satisfies the requirement openvino-genai==2024.5.0.0 (from versions: 2024.2.0.0, 2024.3.0.0, 2024.4.0.0, 2024.4.1.0.dev20240926)
ERROR: No matching distribution found for openvino-genai==2024.5.0.0
```

which I think is because 2024.5.0.0 is not released yet.
@ilya-lavrenov @Wovchena @iefode @CuriousPanCake
> and when using pip install openvino-genai==2024.5.0.0 it is showing

OpenVINO 2024.5.0 is not released yet. It's available as a pre-release package and should be installed with the extra options `--pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly`.
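Put together, the install command would look like this (a sketch of the options mentioned above):

```sh
# Install the pre-release GenAI wheel from the nightly index.
python3 -m pip install --pre openvino-genai --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
```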
> You mentioned that this issue can be resolved by using the current master, can you provide more clarity regarding this?

I was able to run speculative_decoding_lm from Docker: `sudo docker run -it ubuntu:20.04 /bin/bash`. You can try the same to verify it works for you. If it passes, you need to find which part of your steps diverged.
```sh
cd ~
apt update
apt install git python3.9 -y
apt install python3.9-dev -y
git clone --recursive https://github.com/openvinotoolkit/openvino.git
git clone --recursive https://github.com/openvinotoolkit/openvino.genai.git
cd openvino
./install_build_dependencies.sh
mkdir build && cd build
cmake -DENABLE_PYTHON=ON -DPython3_EXECUTABLE=/usr/bin/python3.9 -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --parallel 14
cd --
cmake --install openvino/build --prefix openvino_install
source openvino_install/setupvars.sh
cd openvino.genai
cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
cmake --build ./build/ --config Release --parallel 14
cd ..
cmake --install openvino.genai/build/ --config Release --prefix openvino_install
cd openvino_install/samples/cpp
./build_samples.sh
cd --
python3.9 -m pip install -r ~/openvino.genai/samples/requirements.txt
export PYTHONPATH=/root/openvino_install/python/
python3.9 -m pip install openvino.genai/thirdparty/openvino_tokenizers/ --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
optimum-cli export openvino --trust-remote-code --task text-generation-with-past --model THUDM/chatglm3-6b chatglm3-6b
./openvino_cpp_samples_build/intel64/Release/speculative_decoding_lm chatglm3-6b/ chatglm3-6b/ "Why is the Sun yellow?"
```
> I was able to run speculative_decoding_lm from Docker: `sudo docker run -it ubuntu:20.04 /bin/bash`. You can try the same to verify it works for you. If it passes, you need to find which part of your steps diverged. […]
@Wovchena
I proceeded with almost the same sequence of commands, but I have Ubuntu 24 and Python 3.11, and the extra line `python3.9 -m pip install openvino.genai/thirdparty/openvino_tokenizers/ --pre --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly` is what I was missing. But I also noticed a problem: if I skip the `export PYTHONPATH=/root/openvino_install/python/` line, I get the following error:
while if it is not skipped, then:
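To see which openvino_genai module Python actually resolves with and without the PYTHONPATH override (a minimal sketch; the prefix path assumes the install steps above):

```sh
# Print where the openvino_genai Python module is imported from.
python3 -c "import openvino_genai; print(openvino_genai.__file__)"
```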
Were you able to reproduce it in docker?
> Were you able to reproduce it in docker?

Probably yes, since after executing `./openvino_cpp_samples_build/intel64/Release/speculative_decoding_lm chatglm3-6b/ chatglm3-6b/ "Why is the Sun yellow?"` my laptop used to lag and remain non-responsive for a few minutes, after which it printed `Killed`, maybe due to the high computational resource requirements?
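If the process was killed by the kernel's OOM killer, which is plausible when a draft and a main 6B model run on one laptop, the kernel log should show it (a suggested check, not from the thread):

```sh
# Look for OOM-killer activity in the kernel log.
sudo dmesg | grep -iE 'out of memory|oom-killer|killed process'
```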
@Wovchena
I proceeded as mentioned in the task https://github.com/openvinotoolkit/openvino.genai/issues/259 with the following changes:
1) Extended nightly_models in the file openvino.genai/tests/python_tests/ov_genai_test_utils.py:
```python
nightly_models = [
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "facebook/opt-125m",
    "microsoft/phi-1_5",
    "microsoft/phi-2",
    "THUDM/chatglm2-6b",
    "THUDM/chatglm3-6b",  # no beam_search
    "Qwen/Qwen2-0.5B-Instruct",
    "Qwen/Qwen-7B-Chat",
    "Qwen/Qwen1.5-7B-Chat",
    "argilla/notus-7b-v1",
    "HuggingFaceH4/zephyr-7b-beta",
    "ikala/redpajama-3b-chat",
    "mistralai/Mistral-7B-v0.1",
]
```
2) Added the model to https://github.com/openvinotoolkit/openvino.genai/blob/84702501b688590457e268f4b7f9c2b0bc012c1b/.github/workflows/causal_lm_cpp.yml#L62 as cpp-greedy_causal_lm-Chatglm3-6b and cpp-prompt_lookup_decoding_lm-ubuntu-Chatglm3-6b.
If I missed anything or any modification is needed, please let me know; I will be glad to modify it. I appreciate any help.
You also need to extend the supported models list. Add a note that beam_search_causal_lm isn't supported. Where can I find a pull request?
@Wovchena You can find the pull request here.
Context
This task regards enabling tests for chatglm3-6b. You can find more details under openvino_notebooks LLM chatbot README.md.
Please ask general questions in the main issue at https://github.com/openvinotoolkit/openvino.genai/issues/259
What needs to be done?
Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259
Example Pull Requests
Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259
Contact points
Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259