-
### 🚀 The feature
How to deploy a model service that spans multiple GPUs?
### Motivation, pitch
I have a large model which I run via `torchrun`. I use the **FairScale** library to distribute the mo…
-
I exported a PyTorch model (`model.pt`) to ONNX:
```
def to_numpy(tensor):
    # Detach before moving to CPU when gradients are being tracked
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

torch_model = torch.load(os.pa…
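# A runnable sketch of the rest of the export flow, with assumptions:
# nn.Linear(4, 2) stands in for the truncated torch.load(...) call above,
# and the input/output names are illustrative. torch.onnx.export itself
# is the standard PyTorch export API.
import torch
import torch.nn as nn

torch_model = nn.Linear(4, 2)        # placeholder for the loaded model.pt
torch_model.eval()
dummy_input = torch.randn(1, 4)      # example input with the expected shape
torch.onnx.export(
    torch_model,
    dummy_input,
    "model.onnx",                    # output file
    input_names=["input"],
    output_names=["output"],
)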
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue y…
-
Create a simple Flask app with the following endpoints:
- [ ] **/predict** - predict the shot’s probability of being a goal given the inputs
- Input: the input features to your model, compatible wit…
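A minimal sketch of the `/predict` endpoint could look like the following. Note that `predict_proba` here is a hypothetical stand-in that returns a constant probability; the real model and its feature schema come from the assignment.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict_proba(features):
    # Hypothetical stand-in for the trained model: returns a fixed
    # goal probability regardless of the input features.
    return 0.5

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()  # input features for the model, as JSON
    proba = predict_proba(features)
    return jsonify({"goal_probability": proba})
```

Assuming the file is saved as `app.py`, it can be served locally with `flask --app app run`.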
-
* research key serving frameworks that are optimized for diffusion models
* diffusion models typically take a while to run even on GPUs, so see if we can figure out a way to test and deploy them…
-
Hi, thanks for the great article.
Could you help me with how to save the estimator for serving purposes?
-
Hello guys,
First of all, I would like to thank everyone for this amazing work. If I want to use the Jasper model with TensorFlow Serving, how should I use the data layer code for extracting features …
-
**Describe the bug**
It's frustrating to run, e.g., `ilab -v data generate` only to get:
```
...
DEBUG 2024-09-12 13:23:22,053 instructlab.model.backends.vllm:205: vLLM serving command is: ['/opt/…
-
### Issues Policy acknowledgement
- [X] I have read and agree to submit bug reports in accordance with the [issues policy](https://www.github.com/mlflow/mlflow/blob/master/ISSUE_POLICY.md)
### W…
-
https://arxiv.org/pdf/1905.13348.pdf
https://dl.acm.org/citation.cfm?id=3321443