openvinotoolkit / openvino.genai

Run Generative AI models using native OpenVINO C++ API
Apache License 2.0

[Good First Issue]: Verify mpt-7b-chat with GenAI text_generation #267

Open p-wysocki opened 4 months ago

p-wysocki commented 4 months ago

Context

This task is about enabling tests for mpt-7b-chat. You can find more details in the openvino_notebooks LLM chatbot README.md.

Please ask general questions in the main issue at https://github.com/openvinotoolkit/openvino.genai/issues/259

What needs to be done?

Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259

Example Pull Requests

Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259

Resources

Contact points

Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259

Ticket

No response

qxprakash commented 4 months ago

.take

github-actions[bot] commented 4 months ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

Utkarsh-2002 commented 4 months ago

.take

github-actions[bot] commented 4 months ago

Thanks for being interested in this issue. It looks like this ticket is already assigned to a contributor. Please communicate with the assigned contributor to confirm the status of the issue.

p-wysocki commented 4 months ago

Hello @qxprakash, are you still working on this? Is there anything we could help you with?

qxprakash commented 4 months ago

Hello @p-wysocki, yes, I am working on it. I hit an error while trying to convert mpt-7b-chat:

```sh
python3 ../../../llm_bench/python/convert.py --model_id mosaicml/mpt-7b-chat --output_dir ./MPT_CHAT --precision FP16
```

```
[ INFO ] Removing bias from module=LPLayerNorm((4096,), eps=1e-05, elementwise_affine=True).
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [01:02<00:00, 31.27s/it]
generation_config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 121/121 [00:00<00:00, 781kB/s]
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:87: UserWarning: Propagating key_padding_mask to the attention module and applying it within the attention module can cause unnecessary computation/memory usage. Consider integrating into attn_bias once and passing that to each attention module instead.
  warnings.warn('Propagating key_padding_mask to the attention module ' + 'and applying it within the attention module can cause ' + 'unnecessary computation/memory usage. Consider integrating ' + 'into attn_bias once and passing that to each attention ' + 'module instead.')
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/modeling_mpt.py:311: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert S <= self.config.max_seq_len, f'Cannot forward input with seq_len={S}, this model only supports seq_len<={self.config.max_seq_len}'
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/modeling_mpt.py:253: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  _s_k = max(0, attn_bias.size(-1) - s_k)
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:78: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  _s_q = max(0, attn_bias.size(2) - s_q)
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:79: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  _s_k = max(0, attn_bias.size(3) - s_k)
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:81: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_bias.size(-1) != 1 and attn_bias.size(-1) != s_k or (attn_bias.size(-2) != 1 and attn_bias.size(-2) != s_q):
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:89: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if is_causal and (not q.size(2) == 1):
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:90: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  s = max(s_q, s_k)
[ WARNING ] Failed to send event with the following error: <urlopen error EOF occurred in violation of protocol (_ssl.c:2426)>
[ WARNING ] Failed to send event with the following error: <urlopen error EOF occurred in violation of protocol (_ssl.c:2426)>
```

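Note that the warnings above are non-fatal (telemetry and tracing messages), so the conversion may still have succeeded. One quick way to check is to look for a complete OpenVINO IR in the output directory, i.e. an `.xml` graph file with a matching `.bin` weights file next to it. A minimal sketch, assuming the standard `.xml`/`.bin` IR naming convention (`convert.py` may arrange its output subdirectories differently):

```python
from pathlib import Path

def find_ir_pairs(model_dir):
    """Return every .xml graph file under model_dir (searched recursively)
    that has a matching .bin weights file alongside it."""
    root = Path(model_dir)
    return sorted(
        xml for xml in root.rglob("*.xml")
        if xml.with_suffix(".bin").exists()
    )

if __name__ == "__main__":
    # "./MPT_CHAT" matches the --output_dir used in the command above.
    pairs = find_ir_pairs("./MPT_CHAT")
    if pairs:
        for xml in pairs:
            print(f"IR found: {xml} (+ {xml.with_suffix('.bin').name})")
    else:
        print("No complete IR pair found; conversion likely did not finish.")
```

If this finds a pair, the next step would be pointing the C++ text_generation sample at that directory.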
p-wysocki commented 4 months ago

cc @pavel-esir

qxprakash commented 4 months ago

@pavel-esir I hope this is not a memory issue?

pavel-esir commented 4 months ago

@qxprakash thanks for your update. I see only connection warnings in your logs. Did you get the converted IR? If so, did you run it with the C++ sample?

tushar-thoriya commented 3 months ago

@qxprakash what is the progress?

p-wysocki commented 2 months ago

Hello @qxprakash, are you still working on that issue? Do you need any help?

qxprakash commented 1 month ago

Hi @p-wysocki, I was stuck; I'm currently not working on it.