imadoualid opened this issue 1 month ago
Can you retry the latest version following the Linux installation instructions? I tried and it works fine.
@Superjomn Without Docker? What torch version are you using?
Here are the latest installation instructions; they should work without Docker on Ubuntu. Please give it a try. @imadoualid
I successfully reinstalled using Docker according to the install instructions, and the following code runs without issues:
from tensorrt_llm import LLM, SamplingParams
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
prompts = ["Explain quantum mechanics."]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
outputs = llm.generate(prompts, sampling_params)
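To also see the generated text, you can print the results as the quickstart example does (the output.prompt and output.outputs[0].text attribute paths follow the official example and may vary between versions):
for output in outputs:
    # Each result carries the original prompt and a list of completions
    print(f"Prompt: {output.prompt!r}, Generated: {output.outputs[0].text!r}")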
If you still encounter problems, the issue might be related to the installation process.
Same error in a non-Docker env, installed with the following commands:
mamba create -p conda_env python=3.10
mamba activate ./conda_env
pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu121
pip install tensorrt_llm --extra-index-url https://pypi.nvidia.com
With or without --pre, I get the same error.
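One way to narrow this down is to confirm which torch/CUDA build actually landed in the env; a minimal check using only standard torch attributes:
import torch
print(torch.__version__)          # expected 2.4.0 given the pinned install above
print(torch.version.cuda)         # CUDA version torch was built against, e.g. 12.1
print(torch.cuda.is_available())  # False would point to a driver/runtime problem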
Edited 2024-10-31
Don't install torch manually; just install tensorrt_llm and it works.
@laikhtewari The same issue occurs in the nvcr.io/nvidia/tritonserver:24.10-trtllm-python-py3 container.
System Info
Who can help?
No response
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
I've installed TensorRT-LLM in a conda env following the installation docs:
Install dependencies; TensorRT-LLM requires Python 3.10.
apt-get update && apt-get -y install python3.10 python3-pip openmpi-bin libopenmpi-dev git git-lfs
Install the latest preview version (corresponding to the main branch) of TensorRT-LLM. If you want to install the stable version (corresponding to the release branch), remove the --pre option.
pip3 install tensorrt_llm -U --pre --extra-index-url https://pypi.nvidia.com
Check installation
python3 -c "import tensorrt_llm"
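A slightly more informative check also prints the installed version (this assumes the package exposes __version__, which current releases do), so you can see which build was actually picked up:
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"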
Then I ran the quickstart example:
from tensorrt_llm import LLM, SamplingParams
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
prompts = ["Explain quantum mechanics."]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
outputs = llm.generate(prompts, sampling_params)
I'm getting the error shown below.
Expected behavior
The example to work.
actual behavior
AttributeError: '_SyncQueue' object has no attribute 'get'

Processed requests:   0%|          | 0/4 [00:00<?, ?it/s]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[4], line 11
      7 sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
      9 llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
---> 11 outputs = llm.generate(prompts, sampling_params)

File ~/miniconda3/envs/trt_llm_env/lib/python3.10/site-packages/tensorrt_llm/hlapi/llm.py:211, in LLM.generate(self, inputs, sampling_params, use_tqdm, lora_request)
    205     futures.append(future)
    207 for future in tqdm(futures,
    208                    desc="Processed requests",
    209                    dynamic_ncols=True,
    210                    disable=not use_tqdm):
--> 211     future.result()
    213 if unbatched:
    214     futures = futures[0]

File ~/miniconda3/envs/trt_llm_env/lib/python3.10/site-packages/tensorrt_llm/executor.py:316, in GenerationResult.result(self, timeout)
    314 def result(self, timeout: Optional[float] = None) -> "GenerationResult":
    315     while not self._done:
--> 316         self.result_step(timeout)
    317     return self

File ~/miniconda3/envs/trt_llm_env/lib/python3.10/site-packages/tensorrt_llm/executor.py:306, in GenerationResult.result_step(self, timeout)
    305 def result_step(self, timeout: Optional[float] = None):
--> 306     response = self.queue.get(timeout=timeout)
    307     self.handle_response(response)

AttributeError: '_SyncQueue' object has no attribute 'get'
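The traceback shows GenerationResult.result_step calling self.queue.get(...), while the queue it holds is a _SyncQueue that apparently implements only the producer side. A hypothetical stand-in class (not the real TensorRT-LLM implementation) reproduces the same shape of failure:
import queue

class _SyncQueue:  # hypothetical stand-in: only exposes put(), like a producer-side handle
    def __init__(self):
        self._q = queue.Queue()

    def put(self, item):
        self._q.put(item)

q = _SyncQueue()
q.put("response")
q.get(timeout=1.0)  # AttributeError: '_SyncQueue' object has no attribute 'get'
Consistent with the comments above, a consumer calling get() on the wrong queue type points at a mismatched installation rather than at the user script.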
additional notes
Also tried Gemma and got the same error.