-
### Willingness to contribute
The MLflow Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the MLflow code ba…
-
Hi, in previous issues you wrote that you planned to integrate FlashInfer into an inference backend such as gpt-fast. That would be very interesting! May I ask whether we can integrate FlashInfer into gpt-f…
-
### What is the issue?
I am trying to load the Mixtral 8x22B model on an A100 GPU as a Kubernetes deployment, but it still has not loaded after 6 minutes.
The Mistral 7B model loads fine.
Here is the de…
-
I tried to explore available approaches for distributed training of large-scale recommendation models with huge embedding tables and tried to use TFRA `DynamicEmbedding` combined with `MultiWorkerMirr…
-
Subscribe to this issue and stay notified about new [daily trending repos in Python](https://github.com/trending/python?since=daily)!
-
**Describe the bug**
I got an error trying to convert from a saved_model.pb built with TensorFlow 2.3.0:
`Traceback (most recent call last):
File "C:\Program Files\Python38\lib\runpy.py", lin…
-
Open issue to openly discuss potential ideas or improvements, whether on documentation, interfaces, examples, bug fixes, etc.
-
### Proposal to improve performance
Recently, vLLM has been conducting a lot of work related to Speculative Decoding, and we often see remarkable achievements.
For the Speculative Decoding algorit…
-
### 🐛 Describe the bug
Command I used to create the model file:
`torch-model-archiver --model-name yolo_tiny --version 1.0 --model-file model.pth --handler handler.py
torchserve --start --m…
-
First of all, thank you very much for all the effort put into this project. From what I have seen over the past couple of weeks of investigating it, I am really impressed by its state and performance!
…