-
### Problem Statement
Nowadays, remote model servers like AWS SageMaker, Bedrock, OpenAI, Cohere, etc. all support batch predict APIs, which allow users to send a large number of synchronous request…
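As a concrete reference point, a minimal sketch of one such remote batch workflow, using OpenAI's Batch API as the example; the file name `requests.jsonl` and the use of the chat-completions endpoint are placeholders, not part of the original report:

```python
# Sketch of a remote batch predict workflow via OpenAI's Batch API.
# "requests.jsonl" is a placeholder file where each line is one request
# in the documented batch request format.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the batch of requests as a JSONL file.
batch_file = client.files.create(
    file=open("requests.jsonl", "rb"),
    purpose="batch",
)

# Submit the batch job; results are produced asynchronously and
# fetched later from the job's output file.
job = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(job.id, job.status)
```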
-
Hi,
I am receiving the following warning on my custom model:
"**WARNING: Adaptive Batching is enabled for model 'models' but not supported for inference streaming. Falling back to non-batched in…
-
Hi there! Great work!
Is it possible to run batched inference?
Thanks!
-
I have implemented an inference API using ONNX Runtime and FastAPI to process multiple prompts in batches, with the goal of improving efficiency. However, I've observed that performance is significant…
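For context, a minimal sketch of what batched ONNX Runtime inference can look like: one `session.run` call over the whole batch instead of a per-prompt loop. The model path `model.onnx` and the input/output names (`input_ids`, `attention_mask`, `logits`) are assumptions about the exported graph, not the reporter's actual setup:

```python
# Sketch: batched inference with ONNX Runtime, assuming a model exported
# with a dynamic batch dimension. Names below are placeholders.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

def predict_batch(input_ids: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    # input_ids / attention_mask: (batch, seq_len) arrays for the padded batch.
    # A single run over the batch amortizes session overhead across prompts.
    (logits,) = session.run(
        ["logits"],
        {"input_ids": input_ids, "attention_mask": attention_mask},
    )
    return logits
```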
-
I would like to perform batch inference. Can you please point me to some resources or provide support for it? Thanks a lot.
-
We're having trouble running inference efficiently at scale. We're currently processing the audio parts one by one, as is the default for inference, but is there any support for batch inference to speed th…
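A minimal sketch of the general idea, assuming a PyTorch model that accepts a `(batch, samples)` tensor; `model` and the zero-padding scheme are assumptions for illustration, not the project's actual API:

```python
# Sketch: batch variable-length audio clips by zero-padding them to a
# common length, then run one forward pass instead of a per-clip loop.
import torch
from torch.nn.utils.rnn import pad_sequence

def run_batched(model: torch.nn.Module, clips: list[torch.Tensor]) -> torch.Tensor:
    # clips: list of 1-D waveforms of differing lengths.
    batch = pad_sequence(clips, batch_first=True)  # (num_clips, max_len)
    with torch.no_grad():
        return model(batch)
```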
-
### System Info
x86-64
4× A10 GPUs
version 0.9.0
### Who can help?
_No response_
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supported tas…
-
### Checklist
- [ ] I have searched related issues but cannot get the expected help.
- [ ] I have read the [FAQ documentation](https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/faq.md) but …
-
## ❓ How to do something using detectron2
Currently, DensePose reads in single images and infers dense annotations. This is very slow and quite wasteful. Does DensePose have the ability to read in bat…
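A minimal sketch of batched inference through detectron2's list-of-dicts model interface, which DensePose models share; the config and weight paths are placeholders, and any DensePose-specific config additions are omitted:

```python
# Sketch: batched detectron2 inference. The underlying model (unlike
# DefaultPredictor) accepts a list of input dicts and returns one output
# dict per image. "config.yaml" / "model.pth" are placeholders.
import torch
from detectron2.config import get_cfg
from detectron2.modeling import build_model
from detectron2.checkpoint import DetectionCheckpointer

cfg = get_cfg()
cfg.merge_from_file("config.yaml")
model = build_model(cfg)
DetectionCheckpointer(model).load("model.pth")
model.eval()

def predict_batch(images: list[torch.Tensor]) -> list[dict]:
    # images: list of (C, H, W) tensors in the format the config expects.
    inputs = [
        {"image": img, "height": img.shape[1], "width": img.shape[2]}
        for img in images
    ]
    with torch.no_grad():
        return model(inputs)
```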
-
### Checklist
- [X] I have searched related issues but cannot get the expected help.
- [X] I have read the [FAQ documentation](https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/faq.md) but …