Closed: nobodybut closed this issue 10 months ago.
How do I get this to use the GPU?
I got this error; is there a fix?

Traceback (most recent call last):
  File "D:\hub\text-generation-webui\modules\callbacks.py", line 55, in gentask
    ret = self.mfunc(callback=_callback, *args, **self.kwargs)
  File "D:\hub\text-generation-webui\modules\text_generation.py", line 294, in generate_with_callback
    shared.model.generate(**kwargs)
  File "C:\Users\10153/.cache\huggingface\modules\transformers_modules\Qwen_Qwen-7B-Chat\modeling_qwen.py", line 1051, in generate
    return super().generate(
  File "C:\Users\10153\anaconda3\envs\textgen\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\10153\anaconda3\envs\textgen\lib\site-packages\transformers\generation\utils.py", line 1296, in generate
    eos_token_id = eos_token_id[0]
IndexError: list index out of range
It uses the GPU by default and only falls back to the CPU when no GPU is detected. I raised this precisely because on CPU it only uses a single thread, which is far too slow.
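As an aside on the single-thread CPU problem: PyTorch's intra-op thread count can be raised manually before loading the model. A minimal sketch (this is generic PyTorch, not a text-generation-webui flag, and it may not help if the bottleneck is elsewhere):

```python
import os
import torch

# PyTorch may default to fewer intra-op threads than the machine has cores;
# raising it can speed up CPU inference for ops that parallelize internally.
torch.set_num_threads(os.cpu_count() or 1)
```

This must run before any parallel CPU work starts, so it would need to go early in the webui's startup path rather than after the model is loaded.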
I did write a PS, you just didn't read it. PS: when cloning from huggingface with Git, one file, qwen.tiktoken, is missed by default (I don't know whether that's just my case). Your error looks exactly like that problem: download the file manually and put it in your D:\hub\text-generation-webui\modules\Qwen_Qwen-7B-Chat\ directory and it should work.
I already downloaded that file manually and still get the error; I had to manually add eos_token and eos_token_id here to make it work.
So when you run this on Windows, can you saturate the CPU, or is only one CPU thread running? Oh, maybe you're running directly on the GPU. Envious. Bye...
Could you share the concrete fix? I hit this problem too. (For example: which file, which line, and what code to add?)
In tokenization_qwen.py, at line 86, add the following two lines:

self.eos_token_id = self.eod_id
self.eos_token = ENDOFTEXT
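For anyone looking at that patch out of context: the two attributes are set inside the tokenizer's __init__. A standalone sketch of the idea (ENDOFTEXT and eod_id are stubbed here so the snippet runs by itself; in the real tokenization_qwen.py they already exist, and 151643 is my assumption for the <|endoftext|> id in the released Qwen vocab):

```python
# Standalone sketch of the patch described above, not the real tokenizer.
ENDOFTEXT = "<|endoftext|>"  # already defined in tokenization_qwen.py

class QWenTokenizerPatch:
    def __init__(self):
        self.eod_id = 151643  # assumed id of <|endoftext|> in Qwen's vocab
        # The two lines the comment above says to add around line 86:
        # expose the end-of-document token as a standard EOS token so that
        # transformers' generate() sees a non-empty eos_token_id.
        self.eos_token_id = self.eod_id
        self.eos_token = ENDOFTEXT

tok = QWenTokenizerPatch()
```

The point of the patch is that transformers expects eos_token / eos_token_id to be populated; Qwen's tokenizer only defines eod_id, so generate() finds no EOS token and fails.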
This issue is resolved: with the latest model on HF, split into smaller shard files, all CPU cores can now be used.
But a new problem appeared: generation doesn't stop in time. After answering the question it keeps rambling; maybe this webui is still missing some compatibility piece...
Nothing wrong there. Are you using a prompt template?
If you are using the Qwen-7B-Chat model with a ChatML-format template, try configuring the relevant parameters in the UI: skip_special_tokens=False and custom_stopping_strings=["<|im_start|>", "<|im_end|>", "<|endoftext|>"]. Thanks for your support.
I also get this.
Successfully loaded the model
But the model doesn't generate.
Traceback (most recent call last):
  File "/home/eric/git/text-generation-webui/modules/callbacks.py", line 55, in gentask
    ret = self.mfunc(callback=_callback, *args, **self.kwargs)
  File "/home/eric/git/text-generation-webui/modules/text_generation.py", line 307, in generate_with_callback
    shared.model.generate(**kwargs)
  File "/home/eric/.cache/huggingface/modules/transformers_modules/Qwen-7B-Chat/modeling_qwen.py", line 1095, in generate
    return super().generate(
  File "/home/eric/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/eric/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/generation/utils.py", line 1391, in generate
    eos_token_id = eos_token_id[0]
IndexError: list index out of range
Output generated in 0.52 seconds (0.00 tokens/s, 0 tokens, context 25, seed 1460570118)
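For anyone puzzled by the last frame: transformers normalizes eos_token_id into a list before indexing element 0, so if the tokenizer exposes no EOS token at all (which is what the missing qwen.tiktoken / missing eos_token attributes discussed above plausibly cause), the list is empty and indexing raises exactly this error. A minimal reproduction of the failing pattern:

```python
# What an EOS-less tokenizer effectively hands to generate():
eos_token_id = []
try:
    eos_token_id = eos_token_id[0]  # the line from the traceback
except IndexError as e:
    print(f"IndexError: {e}")  # prints "IndexError: list index out of range"
```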
This model expects the ChatML prompt format, like this:
<|im_start|>system
You are a helpful AI assistant.<|im_end|>
<|im_start|>user
Please tell me a story<|im_end|>
<|im_start|>assistant
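The ChatML layout above can be built programmatically; a small sketch (the helper name build_chatml is mine, not part of any library):

```python
def build_chatml(messages):
    """Render (role, content) pairs in ChatML and open the assistant turn."""
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>" for role, content in messages]
    parts.append("<|im_start|>assistant")  # model continues from here
    return "\n".join(parts)

prompt = build_chatml([
    ("system", "You are a helpful AI assistant."),
    ("user", "Please tell me a story"),
])
print(prompt)  # reproduces the ChatML example above
```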
How did you solve this problem?
The method @5102a provided did solve this problem for me.
After making that change I hit a different error:

Traceback (most recent call last):
  File "/root/text-generation-webui-1.5/modules/callbacks.py", line 55, in gentask
    ret = self.mfunc(callback=_callback, *args, **self.kwargs)
  File "/root/text-generation-webui-1.5/modules/text_generation.py", line 293, in generate_with_callback
    shared.model.generate(**kwargs)
  File "/root/anaconda3/envs/gpu/lib/python3.9/site-packages/auto_gptq/modeling/_base.py", line 443, in generate
    return self.model.generate(**kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/Qwen-7B-Chat-Int4/modeling_qwen.py", line 1136, in generate
    return super().generate(
  File "/root/anaconda3/envs/gpu/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/root/anaconda3/envs/gpu/lib/python3.9/site-packages/transformers/generation/utils.py", line 1580, in generate
    input_ids, model_kwargs = self._expand_inputs_for_generation(
  File "/root/anaconda3/envs/gpu/lib/python3.9/site-packages/transformers/generation/utils.py", line 725, in _expand_inputs_for_generation
    input_ids = input_ids.repeat_interleave(expand_size, dim=0)
RuntimeError: Storage size calculation overflowed with sizes=[4387417344218020454]
For the compatibility issue of Qwen-7B-Chat not stopping in text-generation-webui, see #253. We do not recommend modifying qwen's tokenizer. If the problem persists, please reopen this issue. Thanks for your support!
If you don't care about adoption, then you don't need to fix it.
Thanks a lot!
How do I stop generation through the API?
Comments left on huggingface might not be seen; it's livelier here. When loading the Qwen/Qwen-7B-Chat model with text-generation-webui, the parameters are as shown in figure 1 (this machine's GPU is weak but its CPU is good). After loading, only 1 CPU thread is used by default (figure 2), leaving most of the CPU idle, and inference is extremely slow. I checked your open-source readme and found no information on adjusting launch parameters. Where can I adjust the launch parameters so that more CPU threads are used for inference? Thanks.

PS: When downloading from huggingface with Git, one file, qwen.tiktoken, is missed by default; I don't know whether that's just my case.