neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

Use nullcontext in nl_decoder_engine when timer_manager.current is None #1277

Closed mgoin closed 1 year ago

mgoin commented 1 year ago

There is an error that seems to occur only when generating the second token with streaming=True, and it only shows up in specific pipeline setups such as Gradio.

Error:

  File "/home/user/.local/lib/python3.10/site-packages/deepsparse/transformers/engines/nl_decoder_engine.py", line 194, in __call__
    with timer.time(f"EXECUTE_ENGINE_SEQ_LEN_{self.sequence_length}"):
AttributeError: 'NoneType' object has no attribute 'time'

This PR uses nullcontext in place of the timer's context manager when the timer is None, so the engine call runs untimed instead of raising.
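A minimal sketch of the fallback pattern, not the actual deepsparse code: `StubTimer` and `run_engine` are hypothetical stand-ins, and only `contextlib.nullcontext` and the stage-name format from the traceback come from the thread. It shows the `with` block working whether or not a timer is available.

```python
from contextlib import contextmanager, nullcontext
from time import perf_counter


class StubTimer:
    """Hypothetical stand-in for a timer object that exposes .time(stage)."""

    def __init__(self):
        self.measurements = {}

    @contextmanager
    def time(self, stage):
        # Record the wall-clock duration of the enclosed block under `stage`.
        start = perf_counter()
        try:
            yield
        finally:
            self.measurements[stage] = perf_counter() - start


def run_engine(timer, sequence_length):
    """Hypothetical engine call; `timer` may be None."""
    stage = f"EXECUTE_ENGINE_SEQ_LEN_{sequence_length}"
    # Fall back to a no-op context manager when no timer is active,
    # which avoids AttributeError: 'NoneType' object has no attribute 'time'.
    ctx = timer.time(stage) if timer is not None else nullcontext()
    with ctx:
        return sequence_length * 2  # placeholder for the real engine work


# Works without a timer (the failing case in the traceback above)...
untimed = run_engine(None, 4)

# ...and still records a measurement when a timer is present.
timer = StubTimer()
timed = run_engine(timer, 4)
```

The design choice here is to keep a single `with` statement and swap the context manager, rather than duplicating the engine call in an `if`/`else` branch.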

mgoin commented 1 year ago

Resolved by https://github.com/neuralmagic/deepsparse/pull/1294