Open SehajDxstiny opened 9 months ago
I tried the main branch with your above steps and the engine could be built successfully.
Could you please check /root/TensorRT-LLM/tensorrt_llm/models/gpt/model.py line 221 ?
I suppose you're using main branch code.
I tried this too, and got the same error. Where are you running this from? Do you run it from inside the repository? I installed TensorRT-LLM using the Linux installation commands:
# Install dependencies, TensorRT-LLM requires Python 3.10
apt-get update && apt-get -y install python3.10 python3-pip openmpi-bin libopenmpi-dev
# Install the latest version of TensorRT-LLM
pip3 install tensorrt_llm -U --extra-index-url https://pypi.nvidia.com
# Check installation
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
I tried again; it doesn't work. The code at line 221:
if position_embedding_type == PositionEmbeddingType.learned_absolute:
self.position_embedding = Embedding(max_position_embeddings,
hidden_size,
dtype=dtype)
edit: any thoughts? @nv-guomingz
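For context, the construct at line 221 is a standard learned absolute position embedding. An illustration only (plain NumPy, not TensorRT-LLM internals): the embedding is a trainable lookup table of shape `[max_position_embeddings, hidden_size]`, and position ids simply index rows of it.

```python
import numpy as np

# Illustration of a learned_absolute position embedding: a lookup table of
# shape [max_position_embeddings, hidden_size]; position ids index the table
# and the resulting rows are added to the token embeddings.
max_position_embeddings, hidden_size = 8, 4
rng = np.random.default_rng(0)
table = rng.standard_normal((max_position_embeddings, hidden_size))

position_ids = np.arange(3)        # positions 0, 1, 2
pos_emb = table[position_ids]      # one row per position, shape (3, 4)
```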
Did you run the gpt example on a Docker image built with the doc instructions?
Since I can't reproduce your issue locally, I suggest you add debug code to check whether tensorrt_llm_gpt has a position_embedding attribute or not.
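A minimal sketch of such a debug check. `DummyGPT` is a hypothetical stand-in for the real model object; only the attribute name `position_embedding` comes from the model.py snippet above.

```python
# Sketch of the suggested debug check. DummyGPT is a hypothetical stand-in
# for the real tensorrt_llm GPT model object.
class DummyGPT:
    def __init__(self, learned_absolute: bool = True):
        # Mirrors model.py line 221: the attribute is only created when the
        # position embedding type is learned_absolute.
        if learned_absolute:
            self.position_embedding = object()  # placeholder for Embedding(...)

def has_position_embedding(model) -> bool:
    present = hasattr(model, "position_embedding")
    print(f"position_embedding present: {present}")
    return present
```

In the real script, calling `has_position_embedding(...)` on the constructed model right before building the engine would confirm whether the attribute is missing.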
@nv-guomingz this is how I installed TensorRT-LLM (as given in the installation guide for Linux in the repo):
# Install dependencies, TensorRT-LLM requires Python 3.10
apt-get update && apt-get -y install python3.10 python3-pip openmpi-bin libopenmpi-dev
# Install the latest version of TensorRT-LLM
pip3 install tensorrt_llm -U --extra-index-url https://pypi.nvidia.com
# Check installation
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
@SehajDxstiny Your installation command will install version 0.7.1, which differs from the main branch. You should try using the examples/gpt files in the 0.7.1 branch.
In my case, the issue was resolved by replacing only four files: weight.py and build.py in ~/examples/gpt, and run.py and utils.py in ~/examples. If the issue persists, I believe replacing all your example code with the version 0.7.1 copies should resolve the problem.
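The mismatch described above can be caught early with a quick sanity check. A hedged sketch (the ref names are assumptions, not actual TensorRT-LLM tag names):

```python
# Sketch: the failure in this thread came from mixing a pip-installed 0.7.1
# wheel with example scripts from the main branch. A simple guard is to
# compare the wheel version against the git ref of the examples checkout.
def examples_match_wheel(wheel_version: str, examples_ref: str) -> bool:
    """True if the examples ref (e.g. "v0.7.1"; name assumed) embeds the
    wheel version; "main" never matches a released wheel."""
    return wheel_version in examples_ref

# The situation in this thread: 0.7.1 wheel, main-branch examples.
mismatch = not examples_match_wheel("0.7.1", "main")
```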
I ran all the steps given to run gpt2-medium here: https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/gpt, specifically:
rm -rf gpt2 && git clone https://huggingface.co/gpt2-medium gpt2
pushd gpt2 && rm pytorch_model.bin model.safetensors && wget -q https://huggingface.co/gpt2-medium/resolve/main/pytorch_model.bin && popd
python3 hf_gpt_convert.py -i gpt2 -o ./c-model/gpt2 --tensor-parallelism 1 --storage-type float16
after that I ran this command:
python3 build.py --model_dir=./c-model/gpt2/1-gpu --use_gpt_attention_plugin --remove_input_padding
I get an error: