turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License
3.19k stars 234 forks

ExLlamaV2StreamingGenerator error #451

Closed: nktice closed this issue 1 month ago

nktice commented 1 month ago

While updating my guide, I started getting this error, so I thought I'd report it... https://github.com/nktice/AMD-AI/blob/main/ROCm6.0.md

With ExLlamaV2 0.0.21 - The model loads fine, but when I try a query, I get this error - [ I checked another loader, and Oobabooga TGW's built-in Exllamav2_HF works fine, and answers queries... ]

Traceback (most recent call last):
  File "/home/n/text-generation-webui/modules/text_generation.py", line 425, in generate_reply_custom
    for reply in shared.model.generate_with_streaming(question, state):
  File "/home/n/text-generation-webui/modules/exllamav2.py", line 144, in generate_with_streaming
    chunk, eos, _ = self.generator.stream()
                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/generator/streaming.py", line 476, in stream
    chunk, eos, chunk_token_ids, probs, _, _, logits, _ = self._stream()
                                                          ^^^^^^^^^^^^^^
  File "/home/n/miniconda3/envs/textgen/lib/python3.11/site-packages/exllamav2/generator/streaming.py", line 617, in _stream
    if self.stop_strings_utf32_offsets is not None:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'ExLlamaV2StreamingGenerator' object has no attribute 'stop_strings_utf32_offsets'
Output generated in 0.06 seconds (0.00 tokens/s, 0 tokens, context 64, seed 1666915439)
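The failure pattern in the traceback can be illustrated with a toy class (hypothetical names, not the real exllamav2 code): if the attribute is only assigned inside the stop-condition setter rather than in the constructor, any generator that never had stop conditions set raises AttributeError on its first streaming call.

```python
class ToyStreamingGenerator:
    """Toy illustration of the regression pattern, not the real
    exllamav2 implementation.

    `stop_strings_utf32_offsets` is only assigned in
    `set_stop_conditions()`, so calling `stream()` before that
    raises AttributeError.
    """

    def set_stop_conditions(self, stop_conditions):
        # Assigning here instead of in __init__ is the bug pattern:
        # the attribute does not exist until this method runs.
        self.stop_strings_utf32_offsets = stop_conditions or None

    def stream(self):
        # Fails with AttributeError if set_stop_conditions() was
        # never called, mirroring the traceback above.
        if self.stop_strings_utf32_offsets is not None:
            return "checking stop strings"
        return "streaming"


gen = ToyStreamingGenerator()
try:
    gen.stream()
except AttributeError as e:
    print(e)

# The workaround suggested below: setting an empty stop-condition
# list initializes the attribute, after which streaming proceeds.
gen.set_stop_conditions([])
print(gen.stream())
```

This is why `generator.set_stop_conditions([])` works as a stopgap: it initializes the missing attribute even when no stop conditions are wanted.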
waterangel91 commented 1 month ago

I have this exact error as well. I ended up downgrading back to v0.0.20.

turboderp commented 1 month ago

Yeah, it's a regression. I overlooked the case where the streaming generator is used without stop conditions. I'll have a fix out soon. In the meantime, if you call generator.set_stop_conditions([]) it should initialize properly and work as before.

waterangel91 commented 1 month ago

Thank you very much for sharing the temporary fix.

nktice commented 1 month ago

I tried the new version, and it appears to have resolved the issue above. Thank you for your work... everything now works as one would expect.