ncomly-nvidia opened this issue 6 months ago
Please add CohereAI!!
CohereForAI/c4ai-command-r-plus
Llama 3 would be great (both 8B and 70B): https://github.com/NVIDIA/TensorRT-LLM/issues/1470
Maybe quantized to 8 or even 4 bit.
Currently, Llama 3 throws a bunch of errors when converting to TensorRT-LLM.
Any idea about support for Llama 3?
Phi-3-mini should be amazing! Such a small 3.8B model could run quantized on many GPUs, with as little as 4GB VRAM.
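To sanity-check the "4GB VRAM" claim, here is a back-of-the-envelope sketch (my own illustration, not an official TensorRT-LLM calculation) estimating weight memory for a 3.8B-parameter model at different quantization levels. It ignores activation memory, KV cache, and quantization overhead, so real usage will be higher.

```python
def weight_vram_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight-only memory in GiB: params * bits / 8 bytes.

    Assumption: uniform quantization of all weights; overhead
    (scales, activations, KV cache) is not counted.
    """
    return num_params * bits_per_weight / 8 / 1024**3

# Phi-3-mini: ~3.8B parameters
phi3 = 3.8e9
print(f"fp16: {weight_vram_gib(phi3, 16):.1f} GiB")  # ~7.1 GiB
print(f"int8: {weight_vram_gib(phi3, 8):.1f} GiB")   # ~3.5 GiB
print(f"int4: {weight_vram_gib(phi3, 4):.1f} GiB")   # ~1.8 GiB
```

At 4-bit the weights alone fit comfortably in 4 GiB, which is why the request above is plausible; fp16 would not fit.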
+1 for Phi-3
+1 for Command R Plus!
CohereForAI/c4ai-command-r-plus
Hello @ncomly-nvidia, I am a student interested in the project! I want to ask: are there any good-first-issue feature requests under Features & Optimizations recently? 🤣
+1 for OpenBMB/MiniCPM-V-2
Hi all, this issue will track the feature requests you've made for TensorRT-LLM and provide a place to see what TRT-LLM is currently working on.
Last update: Jan 14th, 2024

🚀 = in development

Models
- Decoder Only
- Encoder / Encoder-Decoder
- Multi-Modal
- Other

Features & Optimizations
- KV Cache
- Quantization
- Sampling
  - frequency_penalty - #275
  - repetition & presence penalties - #274

Workflow
- Front-ends
- Integrations
- Usage / Installation
- Platform Support