huggingface / optimum-nvidia

Apache License 2.0

Feature request: streamer #60

Open RomanKoshkin opened 8 months ago

RomanKoshkin commented 8 months ago

Can you please add support for streamers in the .generate() function? Passing a streamer argument currently fails with a TypeError:

Exception in thread Thread-2 (llm_loop):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "app_llama4.py", line 333, in llm_loop
    output = model.generate(**generation_kwargs)
TypeError: TensorRTForCausalLM.generate() got an unexpected keyword argument 'streamer'
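
For context, the traceback matches the usual streaming pattern from transformers: generate() runs in a background thread and pushes tokens to a streamer object through put() and end(), while the caller iterates over the streamer. Below is a minimal stdlib-only sketch of that pattern. QueueStreamer, fake_generate, and stream_tokens are hypothetical names used for illustration; fake_generate only stands in for a generate() that accepts a streamer, and none of this is optimum-nvidia code.

```python
import threading
from queue import Queue


class QueueStreamer:
    """Hypothetical stand-in exposing the put()/end() interface that streamers use."""

    _STOP = object()  # sentinel marking the end of generation

    def __init__(self):
        self._queue = Queue()

    def put(self, value):
        # Called by the generating thread for each new piece of output.
        self._queue.put(value)

    def end(self):
        # Called once by the generating thread when generation finishes.
        self._queue.put(self._STOP)

    def __iter__(self):
        return self

    def __next__(self):
        # Blocks until the generating thread provides the next item.
        item = self._queue.get()
        if item is self._STOP:
            raise StopIteration
        return item


def fake_generate(tokens, streamer=None):
    # Assumption: stands in for a generate() that supports a streamer argument.
    output = []
    for token in tokens:
        output.append(token)
        if streamer is not None:
            streamer.put(token)
    if streamer is not None:
        streamer.end()
    return output


def stream_tokens(tokens):
    # Run generation in a worker thread and consume tokens as they arrive.
    streamer = QueueStreamer()
    worker = threading.Thread(
        target=fake_generate, args=(tokens,), kwargs={"streamer": streamer}
    )
    worker.start()
    received = list(streamer)
    worker.join()
    return received
```

With streamer support, the llm_loop thread from the traceback could hand tokens to the main thread in this way instead of waiting for the full output.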