How to install optimum-nvidia properly without building a docker image

Building a Docker image is quite hard for me, so I started from an existing Docker environment that already has TensorRT-LLM 0.6.1 installed.

I checked your Dockerfile, followed its steps, and built TensorRT-LLM with the command below (I am using a 4090, so the CUDA architecture is 89):

python3 scripts/build_wheel.py -j --trt_root /usr/local/tensorrt --python_bindings --cuda_architectures="89-real" --clean

Afterwards, following the Dockerfile, I copied the resulting bindings*.so into the tensorrt_llm directory inside dist-packages. I then installed nvidia-ammo 0.3, as the Dockerfile does, and added the optimum-nvidia directory to the Python path.

I also went into the optimum-nvidia directory and ran pip install -e ., so pip list | grep optimum now shows:

optimum                        1.17.1
optimum-nvidia                 0.1.0b2           /root/autodl-tmp/optimum-nvidia

However, I still cannot import optimum.nvidia, even though tensorrt_llm and tensorrt_llm.bindings import fine:

>>> from optimum.nvidia.pipelines import pipeline
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'optimum.nvidia'
>>>
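
One way to narrow this down is to check which directories Python actually searches for the optimum package, and whether the nvidia subpackage can be located from them. The snippet below is only a diagnostic sketch: the helper name diagnose is hypothetical, and it uses nothing beyond the standard importlib module. If the optimum-nvidia checkout does not appear in the printed search path, the editable install is probably not contributing to the optimum package, which would explain the ModuleNotFoundError.

```python
import importlib
import importlib.util


def diagnose(parent: str, child: str) -> dict:
    """Report where `parent` is searched for and whether `parent.child` resolves.

    `diagnose` is a hypothetical helper written for this issue, not part of
    optimum or optimum-nvidia.
    """
    try:
        pkg = importlib.import_module(parent)
    except ImportError:
        # The parent package itself is missing from this environment.
        return {"parent_found": False, "search_path": [], "child_found": False}

    # For a package, __path__ lists every directory searched for submodules.
    search_path = list(getattr(pkg, "__path__", []))

    try:
        spec = importlib.util.find_spec(f"{parent}.{child}")
    except (ImportError, ValueError):
        spec = None

    return {
        "parent_found": True,
        "search_path": search_path,
        "child_found": spec is not None,
    }


# Inspect the failing import from the traceback above.
report = diagnose("optimum", "nvidia")
print("optimum found:       ", report["parent_found"])
print("optimum search path: ", report["search_path"])
print("optimum.nvidia found:", report["child_found"])
```

Running this in the same interpreter that raises the error, and comparing the search path with the location pip list reports (/root/autodl-tmp/optimum-nvidia), should show whether the problem lies in how the two optimum distributions are combined rather than in the TensorRT-LLM build.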

Could someone please help me install optimum-nvidia properly without building a new image or pulling one from Docker Hub?

Thank you!

huggingface/optimum-nvidia, issue #76