-
Running on 1x H100 with the latest Docker container from Docker Hub
```
>>> fast_pipe = optimum_pipeline('text-generation', 'meta-llama/Meta-Llama-3-8B-Instruct', use_fp8=True)
Special tokens have bee…
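For context, a fuller version of the call above might look like the following sketch. The `optimum_pipeline` import path and the `use_fp8=True` flag follow the optimum-nvidia README; the prompt string is illustrative, and the guard is added here so the snippet degrades gracefully on machines without the library or an FP8-capable GPU:

```python
# Sketch: FP8 text-generation pipeline via optimum-nvidia.
# Requires an FP8-capable GPU (e.g. H100); guarded so it can be
# inspected on machines where the library or hardware is absent.
fast_pipe = None
try:
    from optimum.nvidia.pipelines import pipeline as optimum_pipeline

    fast_pipe = optimum_pipeline(
        "text-generation",
        "meta-llama/Meta-Llama-3-8B-Instruct",
        use_fp8=True,  # quantize to FP8 when the TensorRT engine is built
    )
    print(fast_pipe("Hello, my name is")[0]["generated_text"])
except Exception as err:
    # ImportError without optimum-nvidia; runtime errors without a suitable GPU
    print(f"FP8 pipeline unavailable here: {err}")
```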
-
**Description**
According to the Framework matrix (https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html#framework-matrix-2024), 24.05 is supposed to support TensorRT 10.0.6.1. Th…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and f…
-
https://developer.nvidia.com/nvidia-tensorrt-8x-download
https://blog.csdn.net/sdhdsf132452/article/details/130136330
@lix19937
-
### System Info
- CPU architecture: x86_64
- CPU memory size: 128 GB
- GPU name: NVIDIA GeForce GTX 1660S
- GPU memory size: 6 GB
- TensorRT-LLM branch: main
- TensorRT-LLM commit: 9691e12
- Contai…
-
![Capture](https://github.com/cumulo-autumn/StreamDiffusion/assets/35084983/be8f521c-15c9-40e7-8b83-9ada04cab03b)
-
### System Info
- CPU architecture: x86_64
- Host memory size: 32 GB
- GPU: NVIDIA RTX 2060
- GPU memory size: 12 GB
- TensorRT-LLM v0.10.0
### Who can help?
_No response_
### Information
- [ ] Th…
-
### System Info
tensorrt-llm version 0.11.0.dev2024062500
Architecture: x86_64
AMD EPYC 9354 32-Core Processor
```txt
+----------------------------------------------------------…
-
I am trying to run the benchmark on an NVIDIA Orin 64GB machine due to a lack of GPU resources, but it is too slow, so I would appreciate it if you could add TensorRT-LLM support for it. 🤣
-
### System Info
- CPU architecture: x86_64
- CPU/Host memory size: 32 GB
- GPU name: L4 (g2-standard-8, GCP)
- GPU memory size: 24 GB
- TensorRT-LLM branch or tag (e.g., main, v0.10.0)
- Nvi…