openvinotoolkit / openvino.genai

Run Generative AI models with simple C++/Python API and using OpenVINO Runtime
Apache License 2.0

[Good First Issue]: Verify chatglm3-6b with GenAI text_generation #268

Open p-wysocki opened 8 months ago

p-wysocki commented 8 months ago

Context

This task regards enabling tests for chatglm3-6b. You can find more details under openvino_notebooks LLM chatbot README.md.

Please ask general questions in the main issue at https://github.com/openvinotoolkit/openvino.genai/issues/259

What needs to be done?

Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259

Example Pull Requests

Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259

Resources

Contact points

Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259

Ticket

No response

Utkarsh-2002 commented 8 months ago

.take

github-actions[bot] commented 8 months ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

p-wysocki commented 8 months ago

Hello @Utkarsh-2002, are you still working on this? Is there anything we could help you with?

Utkarsh-2002 commented 8 months ago

Yes, I am working on this. There is some issue with the compilation part, but I am working on it and will let you know if I need any help.

p-wysocki commented 7 months ago

Hello @Utkarsh-2002, please let me know if you're still working on this; for now I'm unassigning you due to prolonged inactivity.

HikaruSadashi commented 7 months ago

.take

github-actions[bot] commented 7 months ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

duydl commented 6 months ago

.take

github-actions[bot] commented 6 months ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

duydl commented 6 months ago

Sorry, I won't be able to access the lab PC until next week, and my laptop is a bit underpowered for the task, so please leave this open for others in the meantime.

p-wysocki commented 6 months ago

No worries, come back anytime you feel like contributing. You're always welcome :)

Jessielovecodings commented 5 months ago

WLB

.take

github-actions[bot] commented 5 months ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

rk119 commented 4 months ago

Hi @Jessielovecodings, are you still working on this issue? If not, I'd like to work on it. cc @p-wysocki

p-wysocki commented 4 months ago

I think we can safely assume the task has been abandoned. I'm assigning you @rk119, thanks for taking a look!

rk119 commented 4 months ago

Thank you for assigning it to me @p-wysocki! I'm on it.

rk119 commented 4 months ago

Hi @p-wysocki,

I am experiencing a few errors while building openvino.genai and would appreciate some guidance on this. I have followed the steps specified in this guide with slight modifications, as some steps don't work on Windows (unzip, for example). Here's what I did, with a few screenshots of the results:

git clone --recursive https://github.com/openvinotoolkit/openvino.genai.git

cd openvino.genai

git submodule update --remote --init

cd ../

curl --output ov.zip https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.3.0-15805-6138d624dc1/w_openvino_toolkit_windows_2024.3.0.dev20240626_x86_64.zip

tar -xf ov.zip

move w_openvino_toolkit_windows_2024.3.0.dev20240626_x86_64 "C:\path\to\openvino.genai"

cd openvino.genai

mklink /D ov w_openvino_toolkit_windows_2024.3.0.dev20240626_x86_64

call ov\setupvars.bat

[screenshot: setupvars.bat ran successfully]

cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/

[screenshot: cmake configure output]

cmake --build ./build/ --config Release --target package -j

[screenshot: cmake build output]

cmake --install ./build/ --config Release --prefix ov

[screenshot: cmake install output]

Note: I am using a Windows 11 machine.

p-wysocki commented 4 months ago

@Wovchena could you please take a look?

Wovchena commented 4 months ago

The link to the zip file is outdated. Use https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.4.0-16067-3f4afc92488/w_openvino_toolkit_windows_2024.4.0.dev20240717_x86_64.zip instead.
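For reference, the download step from the earlier comment would look roughly like this with the updated URL; the folder name used later for the move and mklink commands changes to match the new archive name:

```
REM Updated nightly package (the extracted folder name matches the zip name)
curl --output ov.zip https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.4.0-16067-3f4afc92488/w_openvino_toolkit_windows_2024.4.0.dev20240717_x86_64.zip
tar -xf ov.zip
```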

rk119 commented 4 months ago

The link to the zip file is outdated. Use https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.4.0-16067-3f4afc92488/w_openvino_toolkit_windows_2024.4.0.dev20240717_x86_64.zip instead.

With the updated curl command and the steps mentioned before, after running cmake --build ./build/ --config Release --target package -j I received the following error:

[screenshot: build error output]

Wovchena commented 4 months ago

You have an incorrect environment. Try running it from the Developer Command Prompt for VS 2022 or 2019. If that doesn't help, you are probably missing NSIS: https://nsis.sourceforge.io/Download. Install it.
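A rough sequence, reusing the commands from above and assuming the repository was already configured:

```
REM Open "Developer Command Prompt for VS 2022" (or 2019), then:
cd C:\path\to\openvino.genai
call ov\setupvars.bat
cmake --build ./build/ --config Release --target package -j
```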

rk119 commented 4 months ago

You have an incorrect environment. Try running it from the Developer Command Prompt for VS 2022 or 2019. If that doesn't help, you are probably missing NSIS: https://nsis.sourceforge.io/Download. Install it.

I found the issue: I was missing NSIS, so I installed it, and the commands then worked with no errors. I was trying to run the beam_search_causal_lm sample by following this README and encountered a small issue when I ran the command beam_search_causal_lm TinyLlama-1.1B-Chat-v1.0 "Why is the Sun yellow?":

[screenshot: error from running the sample]

I also tried compiling the .cpp file directly with g++ first.

Edit: I fixed it. I navigated to openvino.genai\build\samples\cpp\beam_search_causal_lm\Release and executed the command there, and it worked! I'd like to confirm whether this is where I was supposed to run the commands.
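For anyone hitting the same thing, the working invocation was roughly as follows; the model path is a placeholder for wherever the exported TinyLlama-1.1B-Chat-v1.0 folder lives:

```
cd openvino.genai\build\samples\cpp\beam_search_causal_lm\Release
beam_search_causal_lm <path\to>\TinyLlama-1.1B-Chat-v1.0 "Why is the Sun yellow?"
```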

Wovchena commented 4 months ago

Yes, that's correct

rk119 commented 4 months ago

@Wovchena When I ran the command optimum-cli export openvino --trust-remote-code --model THUDM/chatglm3-6b chatglm3-6b to run the beam_search_causal_lm sample, I encounter the following issue:

[screenshot: optimum-cli export error]

However, with the text-generation task it exports successfully. Without an explicit task, optimum-cli picks feature-extraction for this model, which is what gives the error. I went ahead and experimented with exporting the mpt-7b-chat model from issue #267 and it exported successfully, so I'm assuming chatglm3-6b's auto-detected task is not supported.

Wovchena commented 4 months ago

@eaidova, can --task auto-detection be fixed for THUDM/chatglm3-6b? optimum-cli detects feature-extraction and fails to export, while --task text-generation passes. Should THUDM/chatglm3-6b's authors be asked to update its task?

@rk119, proceed with exporting the model with explicit --task text-generation. If it's not fixed by the time you are ready to open a pull request, add a note to the corresponding table cell in https://github.com/openvinotoolkit/openvino.genai/blob/master/src/docs/SUPPORTED_MODELS.md#openvino-genai-supported-models.
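A sketch of that export with the task passed explicitly (the same optimum-cli invocation as shown earlier in this thread, only --task text-generation added):

```
optimum-cli export openvino --trust-remote-code --model THUDM/chatglm3-6b --task text-generation chatglm3-6b
```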

rk119 commented 4 months ago

Hi @Wovchena, I have a couple of things to address while verifying the chatglm3-6b model with the samples.

  1. beam_search_causal_lm: This sample runs perfectly when the model is exported with the command optimum-cli export openvino --trust-remote-code --model THUDM/chatglm3-6b chatglm3-6b-with-past --task text-generation-with-past, but not with text-generation as the task, since that gives the following error: Exception from src/inference/src/cpp/infer_request.cpp:79: Check '::getPort(port, name, {_impl->get_inputs(), _impl->get_outputs()})' failed at src/inference/src/cpp/infer_request.cpp:79: Port for tensor name beam_idx was not found.

  2. chat_sample: Same case as beam_search_causal_lm.

  3. continuous_batching_accuracy: Since there was no README with example commands, I went through the .cpp file and first ran continuous_batching_accuracy --model ../../TinyLlama-1.1B-Chat-v1.0/, which gave a successful output. However, the chatglm3-6b model exported with --task text-generation gave the error Check '!variables.empty()' failed at C:\Users\riffa\OneDrive\Desktop\openvino.genai\src\cpp\src\paged_attention_transformations.cpp:21: Model is supposed to be stateful, and with --task text-generation-with-past it gave the error Check 'unregistered_parameters.str().empty()' failed at src/core/src/model.cpp:285: Model references undeclared parameters: opset1::Parameter beam_idx () -> (i32[?]).

  4. continuous_batching_benchmark: No README with instructions. I ran continuous_batching_benchmark --model ../../TinyLlama-1.1B-Chat-v1.0/ to check the execution but received the error Check 'json_file.is_open()' failed at C:\Users\riffa\OneDrive\Desktop\openvino.genai\samples\cpp\continuous_batching_benchmark\continuous_batching_benchmark.cpp:85: Cannot open dataset file. After going through the source file, I was unable to locate the dataset in my directory:

[screenshot: directory listing without the dataset file]

  5. greedy_causal_lm: Same as beam_search_causal_lm.

  6. multinomial_causal_lm: Same as beam_search_causal_lm.

  7. prompt_lookup_decoding_lm: Works for TinyLlama-1.1B-Chat-v1.0/, but for chatglm3-6b with the task text-generation-with-past it gives the error Check 'new_seq_len <= old_seq_len' failed at C:\Users\riffa\OneDrive\Desktop\openvino.genai\samples\cpp\prompt_lookup_decoding_lm\prompt_lookup_decoding_lm.cpp:83

  8. speculative_decoding_lm: Could not export the example model meta-llama/Llama-2-7b-chat-hf specified in the README to run with TinyLlama-1.1B-Chat-v1.0 due to authentication and access issues; any suggestions on how to deal with it?

Note: TinyLlama-1.1B-Chat-v1.0 was loaded by default with the task feature-extraction for all the cases above.

Wovchena commented 4 months ago

1, 2, 3, 5, 6, 7 - use --task text-generation-with-past.

  4. Use https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/blob/main/ShareGPT_V3_unfiltered_cleaned_split.json (a download sketch follows after this list).
  8. @as-suvorov, can you help?
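For item 4, one way to fetch the file is shown below; this assumes the benchmark looks for the dataset in the directory it is run from, as the "Cannot open dataset file" check suggests, and uses the /resolve/ form of the URL so curl downloads the raw file rather than the web page:

```
curl -L -o ShareGPT_V3_unfiltered_cleaned_split.json https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
```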
rk119 commented 4 months ago

Hi @Wovchena,

For 3 and 4, chatglm3-6b does not work with --task text-generation-with-past. It would be ideal if the feature-extraction task were supported so it could be validated.

3 and 4 work for the TinyLlama model with the feature-extraction task.

7 still gives the error specified in my previous comment, even with --task text-generation-with-past.

as-suvorov commented 4 months ago
  1. meta-llama/Llama-2-7b-chat-hf requires license agreement acceptance. In order to download this model you need to log in on the Hugging Face website and accept the license agreement on the model card page. Then, on the machine you are working on, you need to provide your access token; you can set it as the HF_TOKEN environment variable (see the sketch below). Please tell me if this solves the issue.
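A minimal sketch of providing the token on Windows, assuming you already created an access token in your Hugging Face account settings (the value below is a placeholder):

```
REM Set the token for the current cmd session
set HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
REM Or log in once so the huggingface_hub CLI caches the token
huggingface-cli login
```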
Wovchena commented 4 months ago

@eaidova, can you comment on

For 3 and 4, chatglm3-6b does not work with --task text-generation-with-past. It would be ideal if the feature-extraction task were supported so it could be validated.

  1. I mixed the numbers. @as-suvorov, please help with 7.
  2. Since your ticket is about chatglm3-6b, you shouldn't worry about any other models. Use chatglm for speculative_decoding_lm; you can use any other model from the same chatglm family as the second model for the sample (see the sketch below).
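As an illustration only (not verified in this thread), the second model could be another member of the family, for example THUDM/chatglm2-6b, exported the same way:

```
REM Hypothetical second chatglm-family model for speculative_decoding_lm
optimum-cli export openvino --trust-remote-code --model THUDM/chatglm2-6b --task text-generation-with-past chatglm2-6b-with-past
```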
rk119 commented 4 months ago

@Wovchena For 8, I was first trying to run the example provided in the respective sample's README. I am getting the error Check 'vocab_size == main_model.get_tensor("logits").get_shape().back()' failed at C:\Users\riffa\OneDrive\Desktop\openvino.genai\samples\cpp\speculative_decoding_lm\speculative_decoding_lm.cpp:268: vocab size should be the same for the both models when I run speculative_decoding_lm with TinyLlama and chatglm3-6b.

Wovchena commented 4 months ago

I see now. It's not possible to mix models from different families because they use different sets of tokens. To verify that the sample works in your setup you can use TinyLlama-1.1B-Chat-v1.0 as both input models. Once you get access to Llama-2-7b-chat-hf, you can follow the README as is. After that you need to find another model from the chatglm family to use with chatglm3-6b.
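A sketch of that sanity check, using the same exported TinyLlama folder for both model arguments (argument order as in the sample's README; the prompt is just an example):

```
speculative_decoding_lm .\TinyLlama-1.1B-Chat-v1.0\ .\TinyLlama-1.1B-Chat-v1.0\ "Why is the Sun yellow?"
```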

as-suvorov commented 4 months ago
  1. Yeah, I started to look into that. I don't have a fast solution here; I need some time to check chatglm3-6b.
Wovchena commented 4 months ago

3, 4 - what's the error message for --task text-generation-with-past? I can only see the logs for --task text-generation in https://github.com/openvinotoolkit/openvino.genai/issues/268#issuecomment-2241198347

rk119 commented 4 months ago

I see now. It's not possible to mix models from different families because they use different sets of tokens. To verify that the sample works in your setup you can use TinyLlama-1.1B-Chat-v1.0 as both input models. Once you get access to Llama-2-7b-chat-hf, you can follow the README as is. After that you need to find another model from the chatglm family to use with chatglm3-6b.

Alright yes, makes sense. Thank you for clarifying and explaining.

rk119 commented 4 months ago

3, 4 - what's the error message for --task text-generation-with-past? I can only see the logs for --task text-generation in #268 (comment)

For 3 and 4 I receive the same error: Check 'unregistered_parameters.str().empty()' failed at src/core/src/model.cpp:285: Model references undeclared parameters: opset1::Parameter beam_idx () -> (i32[?])

Wovchena commented 4 months ago

Do I understand correctly that you resolved the problem with beam_search_causal_lm, but continuous_batching_accuracy fails? That's strange, because they both rely on the beam_idx parameter. Can you verify that you use the same model for these two samples?

rk119 commented 4 months ago

Do I understand correctly that you resolved the problem with beam_search_causal_lm, but continuous_batching_accuracy fails? That's strange, because they both rely on the beam_idx parameter. Can you verify that you use the same model for these two samples?

Yes I am.

[screenshot: continuous_batching_accuracy error output]

Wovchena commented 4 months ago

If that turns out to be the only problem found, please add a note to https://github.com/openvinotoolkit/openvino.genai/blob/master/src/docs/SUPPORTED_MODELS.md stating:

THUDM/chatglm3-6b works with the default LLMPipeline backend but is known to fail with Check 'unregistered_parameters.str().empty()' failed at src/core/src/model.cpp:285: Model references undeclared parameters: opset1::Parameter beam_idx () -> (i32[?]) for the continuous_batching backend.

Wovchena commented 4 months ago

I extended the common ticket https://github.com/openvinotoolkit/openvino.genai/issues/259 with a new step:

  1. Extend the nightly_models list https://github.com/openvinotoolkit/openvino.genai/blob/c86fd779d49998a7fa2d5f0f25b2964654d1be25/tests/python_tests/ov_genai_test_utils.py#L22 with the given model and run the nightly tests for that model: https://github.com/openvinotoolkit/openvino.genai/tree/master/tests/python_tests#customise-tests-run. Report if there are failing tests and comment out the model in the nightly_models list. Add this change to your pull request (PR). (A sketch of this edit follows below.)
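A rough sketch of that edit in tests/python_tests/ov_genai_test_utils.py; the existing entries are elided, and only the chatglm3-6b line is the actual addition:

```python
# tests/python_tests/ov_genai_test_utils.py (sketch; existing entries abbreviated)
nightly_models = [
    # ... models already present in the list ...
    "THUDM/chatglm3-6b",  # added for this task; comment it out again if nightly tests fail
]
```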
rk119 commented 4 months ago

@Wovchena I went through the GitHub Actions causal_lm_cpp.yml file and couldn't see tests for the other supported models except for one. That made me wonder whether I should add one for chatglm3-6b, whether I'm looking at the right file, and how exactly I should proceed with tasks 3 and 4?

Wovchena commented 4 months ago

If you switch to master, you'll find more models.

I'd expect a 6B model to be too big to fit on the ubuntu-20.04 runner, so you are likely to need a runner with more memory. Such runners aren't available in forks, so this statement no longer applies:

Since default runners are available for everyone, one can verify the test passes by opening a PR in their fork first.

When you open a PR to upstream, we need to trigger builds manually.

mlukasze commented 2 months ago

Hey @rk119, will you work on this?

rk119 commented 2 months ago

Hey @rk119, will you work on this?

Hi, unfortunately I am busy with other commitments at the moment, hence I will unassign myself from this issue.

Aniruddha521 commented 1 month ago

.take

github-actions[bot] commented 1 month ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

Aniruddha521 commented 1 month ago

I have reviewed all the resources and the contributor guide provided, but I have a simple question about adding a test script for the modification I made. Could you please share any resources that offer more detailed information on writing a test script for this task? Additionally, how can we effectively test large models that are difficult to fit in local storage? As I am a beginner in open source, I kindly request your guidance on this.

Wovchena commented 1 month ago

https://github.com/openvinotoolkit/openvino.genai/issues/259 (5) is all we have about extending the tests. optimum-cli offers command-line arguments for quantization to reduce the model size (see the sketch below). This may help fit the model into memory at runtime, although you still need a larger host to run the quantization itself, so I'm not sure whether it helps in your case.
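For example, weight compression during export might look like the sketch below; --weight-format is the optimum-cli option for this, but please check optimum-cli export openvino --help for the values supported by your version:

```
REM Export with 4-bit weight compression to reduce the model size
optimum-cli export openvino --trust-remote-code --model THUDM/chatglm3-6b --task text-generation-with-past --weight-format int4 chatglm3-6b-int4
```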

Aniruddha521 commented 1 month ago

@Wovchena Thank you for your guidance.

Aniruddha521 commented 1 month ago

Hey @Wovchena, I have built and configured CMake (Release), but when I execute ./beam_search_causal_lm TinyLlama-1.1B-Chat-v1.0 "Why is the Sun yellow?" I get the error below:

[screenshot: error output]