-
# TensorRT Model Optimizer - Product Roadmap
[TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer) (ModelOpt)’s north star is to be the best-in-class model optimization toolki…
-
**Describe the bug**
OpenAI API endpoint is "/v1/chat/completions", but OVMS endpoint is "/v3/chat/completions".
Most existing applications don't allow users to change the prefix “**V1**” to "**…
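
The prefix mismatch described above can be illustrated with a minimal sketch. It assumes a client that builds its request URL from a host plus a version prefix. The helper name `chat_completions_url` and the host `http://localhost:8000` are hypothetical, chosen only for illustration; only the two endpoint paths come from the report.

```python
from urllib.parse import urljoin


def chat_completions_url(host: str, api_prefix: str = "v1") -> str:
    """Build a chat completions URL from a host and a version prefix.

    Hypothetical helper: it is not part of OVMS or the OpenAI client.
    """
    base = host.rstrip("/") + "/"  # ensure exactly one trailing slash
    return urljoin(base, f"{api_prefix.strip('/')}/chat/completions")


# Hypothetical host; an application that hard-codes "v1" can only produce the first URL.
openai_style = chat_completions_url("http://localhost:8000")
ovms_style = chat_completions_url("http://localhost:8000", api_prefix="v3")

print(openai_style)
print(ovms_style)
```

An application that exposes the prefix as a setting, as `api_prefix` does here, could reach either endpoint; one that hard-codes it cannot.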
-
# URL
- https://arxiv.org/pdf/2408.02666
# Affiliations
- Tianlu Wang, N/A
- Ilia Kulikov, N/A
- Olga Golovneva, N/A
- Ping Yu, N/A
- Weizhe Yuan, N/A
- Jane Dwivedi-Yu, N/A
- Richard Yu…
-
### Checklist
- [X] I have used the search function to see if someone else has already submitted the same feature request.
- [X] I will describe the problem with as much detail as possible.
- [X] Thi…
-
Hi all,
We've recently open-sourced a new quantization method. VPTQ (Vector Post-Training Quantization) is a novel Post-Training Quantization method that leverages Vector Quantization to achieve hi…
-
## 📚 Documentation
Create an example on how to train a small LLM.
Add it to the examples directory here:
https://github.com/pytorch/xla/tree/master/examples
-
**Describe the bug**
What the bug is and how to reproduce it, preferably with screenshots.
I encountered an issue while training Qwen2VL with Flash Attention enabled. When the training…
-
I wanted to resume from a checkpoint because an issue interrupted MNTP training.
However, when I resumed training, I received the following message that there was no inde…
-
Original Repository: https://github.com/ml-explore/mlx-examples/
Listing examples from there that would be nice to have. We don't expect the models to work the moment they are translated to …
-
Hi :) I'm really interested in this topic and looking forward to the documentation.
Thanks for sharing! 🙏