-
Hello, I want to deploy a quantized Llama-3-8B model using tritonserver. I followed the steps below:
1. Create a container from the nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3 base image.
3.…
-
Hi @NicoleCrockett & @KatieRussellTBE
We've done some research and found a batch converter from UK Grid Reference Finder.
We'd like you to use this tool to add a couple of extra fields (lati…
-
It would be great if baml worked with batch inference tools such as vLLM, or added batch inference support of its own.
-
The cmd/gorename tool long predates the LSP and gopls, and its functionality has since been subsumed by them. We should deprecate and delete the command, following the sequence used for go/pointer in …
-
I installed all the required dependencies and then cloned the SubSai repository. When I try to run the command "subsai-webui" in the command prompt, it says the batch file does not exist.…
-
### System Info
I am working on the benchmarking suite on the vLLM team, and am now trying to run TensorRT-LLM for comparison. I am relying on this GitHub repo (https://github.com/neuralmagic/tensorrt-demo)…
-
I have unpacked the .pack file and edited the content inside, but now I don't know how to repack it. Please help!
-
Discuss a small (if possible, size-3) SCC in the GAV and GA graphs (one each) **in detail** - i.e. look at the dependencies and try to understand what the developers are trying to achieve here.
Do the followi…
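As a starting point for finding such components, here is a minimal Tarjan SCC sketch. The graph, node names, and function name are illustrative placeholders, not taken from the actual GAV/GA data:

```python
def tarjan_scc(graph):
    """Return the strongly connected components of a directed graph
    given as an adjacency dict {node: [successor, ...]}."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = [0]

    def strongconnect(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                strongconnect(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of an SCC: pop the component off the stack.
            scc = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            strongconnect(v)
    return sccs

# Hypothetical toy dependency graph: a -> b -> c -> a forms a size-3 cycle.
deps = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}
components = tarjan_scc(deps)
```

Running this on the toy graph yields a size-3 component {a, b, c} plus the singleton {d}; on the real GAV/GA graphs the same function can be pointed at the dependency edges to locate candidate cycles to discuss.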
-
Why NumPy 1 vs. 2?
I've installed 2 LLMs and 4 image generators from Python projects on GitHub, all running fine with CUDA and tensor cores.
What is that?
(venv) h:\caption\joy-caption-batch>nvcc --version…
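The 1-vs-2 distinction usually matters because NumPy 2.0 changed the ABI, so compiled extensions built against NumPy 1.x can fail to import under 2.x. A small hedged sketch of a version check (the helper names and messages are illustrative; pinning `numpy<2` is a common workaround, not something stated in this thread):

```python
# Detect whether an environment's NumPy is 1.x or 2.x from its version
# string, and suggest the usual remedy for the 2.x ABI break.

def numpy_major(version: str) -> int:
    """Return the major version number from a string like '2.0.1'."""
    return int(version.split(".")[0])

def compat_hint(version: str) -> str:
    if numpy_major(version) >= 2:
        return "NumPy 2.x: pin 'numpy<2' if a compiled extension fails to import"
    return "NumPy 1.x: compatible with most older compiled extensions"
```

For example, `compat_hint("2.0.1")` flags the 2.x pin, while `compat_hint("1.26.4")` reports 1.x; in practice the string would come from `numpy.__version__`.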
-
### System Info
- X86_64
- RAM: 30 GB
- GPU: A10G, VRAM: 23GB
- Lib: TensorRT-LLM v0.9.0
- Container Used: nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3
- Model used: Mistral 7B
### …