NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0
8.57k stars, 972 forks

Why is the performance worse than release 0.12.0 when I run the benchmark of release 0.13.0 #2395

Open rexmxw02 opened 4 days ago

rexmxw02 commented 4 days ago

System Info

Why is performance worse than in release 0.12.0 when I run the benchmark on release 0.13.0 (FP16, FP8, INT8; Llama 2 70B)?

Who can help?

No response

Information

Tasks

Reproduction

Why is performance worse than in release 0.12.0 when I run the benchmark on release 0.13.0 (FP16, FP8, INT8; Llama 2 70B)?

Expected behavior

Why is performance worse than in release 0.12.0 when I run the benchmark on release 0.13.0 (FP16, FP8, INT8; Llama 2 70B)?

actual behavior

Why is performance worse than in release 0.12.0 when I run the benchmark on release 0.13.0 (FP16, FP8, INT8; Llama 2 70B)?

additional notes

Why is performance worse than in release 0.12.0 when I run the benchmark on release 0.13.0 (FP16, FP8, INT8; Llama 2 70B)?

hello-11 commented 4 days ago

@rexmxw02 Could you give us more details about the performance difference, and tell us what hardware you are using?
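For anyone gathering the details requested here, the following is a minimal sketch of a script that collects version and GPU information to paste into the issue. It is not part of TensorRT-LLM. The helper names (`package_version`, `gpu_names`, `environment_report`) are made up for illustration, and the distribution names `tensorrt_llm`, `tensorrt`, and `torch` are assumptions about how the packages were installed. It relies on `nvidia-smi` being on the PATH and returns an empty GPU list otherwise.

```python
import shutil
import subprocess
from importlib import metadata


def package_version(name: str) -> str:
    """Return the installed version of a distribution, or 'not installed'."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def gpu_names() -> list:
    """List GPU names reported by nvidia-smi; empty if the tool is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def environment_report() -> dict:
    """Collect the facts a maintainer usually needs to compare two releases."""
    return {
        # Distribution names below are assumptions; adjust to your install.
        "tensorrt_llm": package_version("tensorrt_llm"),
        "tensorrt": package_version("tensorrt"),
        "torch": package_version("torch"),
        "gpus": gpu_names(),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it once in the 0.12.0 environment and once in the 0.13.0 environment, next to the benchmark command and the measured numbers, gives a like-for-like comparison.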

rexmxw02 commented 4 days ago

NVIDIA H20 and A800 GPUs.

thalapandi commented 4 days ago

OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\fbgemm.dll" or one of its dependencies.

I get this error when installing TensorRT-LLM on Windows by following the guide below:

https://nvidia.github.io/TensorRT-LLM/installation/windows.html

How can I resolve this issue?
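A WinError 126 can mean either that the DLL file itself is absent or that the file exists but one of the libraries it depends on cannot be found. The sketch below tells the two cases apart. The function name `diagnose_dll` and its return labels are made up for illustration, and it is a diagnostic aid under those assumptions, not a fix.

```python
import ctypes
import os
import sys


def diagnose_dll(path: str) -> str:
    """Classify a DLL as 'missing', 'loadable', 'dependency-error', or 'skipped'."""
    if not os.path.isfile(path):
        return "missing"
    if sys.platform != "win32":
        # ctypes.WinDLL only exists on Windows, so the load test is skipped elsewhere.
        return "skipped"
    try:
        ctypes.WinDLL(path)
    except OSError:
        # The file exists but could not be loaded, which usually points at
        # a dependency that Windows could not locate.
        return "dependency-error"
    return "loadable"


if __name__ == "__main__":
    # Path taken from the error message above.
    dll = r"C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\fbgemm.dll"
    print(diagnose_dll(dll))
```

A result of `dependency-error` suggests looking at the runtime libraries the DLL needs rather than reinstalling the package that ships it.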

hello-11 commented 3 days ago

@rexmxw02 Could you try to use the newest version?

hello-11 commented 3 days ago

@thalapandi Thanks for your interest. You can submit a separate issue and provide us with more details.

thalapandi commented 3 days ago

Environment path found: C:\Users\jayme\AppData\Local\NVIDIA\ChatWithRTX\env_nvd_rag
App running with config
{
  "models": {
    "supported": [
      {
        "name": "Mistral 7B int4",
        "installed": true,
        "metadata": {
          "model_path": "model\mistral\mistral7b_int4_engine",
          "engine": "llama_float16_tp1_rank0.engine",
          "tokenizer_path": "model\mistral\mistral7b_hf",
          "max_new_tokens": 1024,
          "max_input_token": 7168,
          "temperature": 0.1
        }
      },
      {
        "name": "Llama 2 13B int4",
        "installed": true,
        "metadata": {
          "model_path": "model\llama\llama13_int4_engine",
          "engine": "llama_float16_tp1_rank0.engine",
          "tokenizer_path": "model\llama\llama13_hf",
          "max_new_tokens": 1024,
          "max_input_token": 3900,
          "temperature": 0.1
        }
      }
    ],
    "selected": "Mistral 7B int4"
  },
  "sample_questions": [
    { "query": "How does NVIDIA ACE generate emotional responses?" },
    { "query": "What is Portal prelude RTX?" },
    { "query": "What is important about Half Life 2 RTX?" },
    { "query": "When is the launch date for Ratchet & Clank: Rift Apart on PC?" }
  ],
  "dataset": {
    "sources": [ "directory", "youtube", "nodataset" ],
    "selected": "directory",
    "path": "dataset",
    "isRelative": true
  },
  "strings": {
    "directory": "Folder Path",
    "youtube": "YouTube URL",
    "nodataset": "AI model default"
  }
}
Traceback (most recent call last):
  File "C:\Users\jayme\AppData\Local\NVIDIA\ChatWithRTX\env_nvd_rag\lib\site-packages\langchain\embeddings\huggingface.py", line 58, in __init__
    import sentence_transformers
ModuleNotFoundError: No module named 'sentence_transformers'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\jayme\AppData\Local\NVIDIA\ChatWithRTX\RAG\trt-llm-rag-windows-main\app.py", line 114, in <module>
    embed_model = HuggingFaceEmbeddings(model_name=embedded_model)
  File "C:\Users\jayme\AppData\Local\NVIDIA\ChatWithRTX\env_nvd_rag\lib\site-packages\langchain\embeddings\huggingface.py", line 61, in __init__
    raise ImportError(
ImportError: Could not import sentence_transformers python package. Please install it with pip install sentence-transformers.
Press any key to continue . . .

In particular, I got the error above. It is the same one discussed here:

https://forums.developer.nvidia.com/t/chat-with-rtx-error-modulenotfounderror-no-module-named-sentence-transformers/283151

I tried many things but could not resolve the issue.
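One possible cause of a ModuleNotFoundError that persists after installing a package is that `pip` installed it into a different Python environment from the one the application runs in. The sketch below checks whether a module is importable from the running interpreter and, if not, prints an install command bound to that same interpreter. The helper names `is_importable` and `install_hint` are made up for illustration. To be meaningful it would need to be run with the interpreter inside the `env_nvd_rag` environment.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


def install_hint(module_name: str, pip_name: str) -> str:
    """Describe the module's status, or give a pip command tied to this interpreter."""
    if is_importable(module_name):
        return f"{module_name} is importable from {sys.executable}"
    # Using "<interpreter> -m pip" installs into the environment that will import it.
    return f'"{sys.executable}" -m pip install {pip_name}'


if __name__ == "__main__":
    # Module and package names are taken from the traceback above.
    print(install_hint("sentence_transformers", "sentence-transformers"))
```

Invoking pip through the interpreter path, rather than a bare `pip` command, avoids installing into whichever Python happens to be first on the PATH.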

rexmxw02 commented 3 days ago

@hello-11 Do you mean the master branch or release 0.13.0?

hello-11 commented 20 hours ago

@rexmxw02 Please try the latest main branch.

hello-11 commented 20 hours ago

@thalapandi Please submit a separate issue.