vonodiripsa opened 5 days ago
What version are you using, and can you share some usage code so we can reproduce it? @vonodiripsa
Could be a similar issue: https://github.com/NVIDIA/TensorRT-LLM/issues/2323. Are you using a Docker or non-Docker environment?
The latest TRT-LLM has a bug in `tensorrt_llm/executor.py` (on `main`).
Issue:

```
File /databricks/python/lib/python3.10/site-packages/tensorrt_llm/hlapi/llm.py:211, in LLM.generate(self, inputs, sampling_params, use_tqdm, lora_request)
    205     futures.append(future)
    207 for future in tqdm(futures,
    208                    desc="Processed requests",
    209                    dynamic_ncols=True,
    210                    disable=not use_tqdm):
--> 211     future.result()
    213 if unbatched:
    214     futures = futures[0]

File /databricks/python/lib/python3.10/site-packages/tensorrt_llm/executor.py:328, in GenerationResult.result(self, timeout)
    326 def result(self, timeout: Optional[float] = None) -> "GenerationResult":
    327     while not self._done:
--> 328         self.result_step(timeout)
    329     return self

File /databricks/python/lib/python3.10/site-packages/tensorrt_llm/executor.py:318, in GenerationResult.result_step(self, timeout)
    317 def result_step(self, timeout: Optional[float] = None):
--> 318     response = self.queue.get(timeout=timeout)
    319     self.handle_response(response)

AttributeError: '_SyncQueue' object has no attribute 'get'
```
Could you please add something like a `get` method to `_SyncQueue`:
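For illustration only, here is a minimal sketch of the kind of delegating `get` the traceback suggests is missing. This is **not** the actual TensorRT-LLM source; it assumes `_SyncQueue` wraps a standard `queue.Queue`, which may not match the real class's internals:

```python
import queue
from typing import Any, Optional

class _SyncQueue:
    """Toy stand-in for tensorrt_llm.executor._SyncQueue (hypothetical)."""

    def __init__(self) -> None:
        # Assumption: the real class holds some underlying queue object.
        self._q: "queue.Queue[Any]" = queue.Queue()

    def put(self, item: Any) -> None:
        self._q.put(item)

    def get(self, timeout: Optional[float] = None) -> Any:
        # Forward to the wrapped queue so a caller like
        # GenerationResult.result_step() can do
        # self.queue.get(timeout=timeout) without an AttributeError.
        return self._q.get(timeout=timeout)

q = _SyncQueue()
q.put("response")
print(q.get(timeout=1.0))  # → response
```

The point is simply that `result_step` calls `self.queue.get(...)`, so whatever object is assigned to `self.queue` needs to expose a `get` with a `timeout` parameter.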
This is critical for us, because our customer-facing LLM demo is failing.