-
Is it possible to run inference in batches instead of one by one?
If so, please suggest an approach.
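A minimal sketch of the usual approach, assuming a PyTorch model with fixed-shape inputs (the `batched_inference` helper and its parameters are illustrative, not from the thread): stack the individual samples into batch tensors and run one forward pass per batch.

```python
import torch

def batched_inference(model, samples, batch_size=32):
    """Answer many samples with one forward pass per batch instead of one per sample."""
    model.eval()
    outputs = []
    with torch.no_grad():
        for i in range(0, len(samples), batch_size):
            batch = torch.stack(samples[i:i + batch_size])  # (B, ...) batch tensor
            outputs.append(model(batch))
    return torch.cat(outputs)

# 100 inputs served in 4 forward passes instead of 100.
model = torch.nn.Linear(16, 4)
samples = [torch.randn(16) for _ in range(100)]
print(batched_inference(model, samples).shape)  # torch.Size([100, 4])
```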
-
Hi Ghustwb,
for batch processing of images, I changed `BATCH_SIZE = 9` and `const size_t size = width * height * sizeof(float3) * BATCH_SIZE`, and filled the `uniform_data` accordingly (uniform_data[volIm…
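A rough Python/numpy analogue of the buffer-sizing idea described above (the original is C++; dimensions and the fill loop here are illustrative): allocate one contiguous buffer sized for the whole batch and fill one slot per image before launching inference.

```python
import numpy as np

BATCH_SIZE = 9
width, height = 640, 480  # illustrative dimensions

# One contiguous float32 buffer for the whole batch, mirroring the C++
# allocation: size = width * height * sizeof(float3) * BATCH_SIZE.
batch_buffer = np.empty((BATCH_SIZE, height, width, 3), dtype=np.float32)

# Fill one slot per image before running inference on the batch.
for i in range(BATCH_SIZE):
    image = np.random.rand(height, width, 3).astype(np.float32)  # stand-in for a real image
    batch_buffer[i] = image

assert batch_buffer.nbytes == width * height * 3 * 4 * BATCH_SIZE
```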
-
**Description**
I implemented multi-instance inference across 4 A100 GPUs by following [this](https://triton-inference-server.github.io/pytriton/latest/binding_models/#multi-instance-model-inferenc…
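For context, a minimal sketch of the multi-instance binding pattern that the linked pytriton page describes: passing a list of inference callables to `infer_func` creates one model instance per entry. Real model loading and the A100 device placement are elided here, and `make_infer_fn` is an illustrative helper.

```python
import numpy as np
from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton

def make_infer_fn(device_id):
    """Build one inference callable per GPU; real model loading is elided."""
    @batch
    def infer_fn(input):
        # A real implementation would run the model on f"cuda:{device_id}";
        # the echo keeps this sketch self-contained.
        return {"output": input}
    return infer_fn

with Triton() as triton:
    triton.bind(
        model_name="my_model",
        # A list of callables binds one model instance per entry, which is
        # how pytriton spreads requests across the four GPUs.
        infer_func=[make_infer_fn(i) for i in range(4)],
        inputs=[Tensor(name="input", dtype=np.float32, shape=(-1,))],
        outputs=[Tensor(name="output", dtype=np.float32, shape=(-1,))],
        config=ModelConfig(max_batch_size=64),
    )
    triton.serve()
```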
-
For example, I have a lot of unpaired images from two domains, A and B.
After training, what should I do to transfer A to B with the model?
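A minimal sketch of the inference step, assuming a CycleGAN-style setup: after training, only the A-to-B generator is needed. The checkpoint path, `netG_A2B`, and the preprocessing values are illustrative, not from the question.

```python
import torch
from PIL import Image
from torchvision import transforms

# Load the trained A->B generator (alternatively, rebuild the generator
# class and use load_state_dict on a state-dict checkpoint).
netG_A2B = torch.load("netG_A2B.pth", map_location="cpu")
netG_A2B.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

img_a = preprocess(Image.open("domain_a.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    fake_b = netG_A2B(img_a)                            # translated image in domain B
fake_b = (fake_b.squeeze(0) * 0.5 + 0.5).clamp(0, 1)    # undo the normalization
transforms.ToPILImage()(fake_b).save("fake_b.jpg")
```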
-
https://github.com/huggingface/text-generation-inference/blob/4dfdb481fb1f9cf31561c056061d693f38ba4168/router/src/infer/v3/queue.rs#L362
When `max_size` is 0 but `batch_requests.len() > 0`, max_bat…
-
### Describe the issue
I tried running inference in batches to accelerate the process. However, it takes just as much time as inferring each instance individually.
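A minimal timing sketch (a PyTorch stand-in for whatever runtime the issue targets; the model and sizes are illustrative) showing how to check whether batching actually amortizes cost. Missing warm-up and missing device synchronization are the usual measurement pitfalls that make batched and per-item runs look identical.

```python
import time
import torch

def bench(model, x, runs=20):
    """Warm up, synchronize, then time the average forward pass."""
    model.eval()
    with torch.no_grad():
        for _ in range(3):          # warm-up: kernel selection, caches
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs

model = torch.nn.Linear(1024, 1024).cuda()
for bs in (1, 8, 32):
    t = bench(model, torch.randn(bs, 1024, device="cuda"))
    print(f"batch={bs}: {t * 1e3:.2f} ms/call, {t / bs * 1e3:.3f} ms/sample")
```

If the per-sample time barely drops as the batch grows, the model is already saturating the device at batch size 1, or the batch dimension is not actually dynamic in the exported graph.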
-
I tried batch inference in XTTS, so I pad each text sequence to the longest in the batch and add an attention mask for it. But for shorter sequences,
I am getting some random…
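A minimal sketch of right-padding a batch of token sequences and building the matching attention mask (`pad_id` and the example sequences are illustrative, not XTTS code):

```python
import torch

def pad_batch(sequences, pad_id=0):
    """Right-pad token sequences to the batch maximum and build the matching mask."""
    max_len = max(len(s) for s in sequences)
    input_ids = torch.full((len(sequences), max_len), pad_id, dtype=torch.long)
    attention_mask = torch.zeros((len(sequences), max_len), dtype=torch.long)
    for i, seq in enumerate(sequences):
        input_ids[i, : len(seq)] = torch.tensor(seq)
        attention_mask[i, : len(seq)] = 1   # 1 = real token, 0 = padding
    return input_ids, attention_mask

ids, mask = pad_batch([[5, 9, 2], [7, 1]])
print(ids)   # tensor([[5, 9, 2], [7, 1, 0]])
print(mask)  # tensor([[1, 1, 1], [1, 1, 0]])
```

If random output persists for the shorter (padded) sequences even with a mask like this, a common cause is the mask not being threaded through every attention layer or through the positional computations.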
-
### Describe the bug
When an exception happens, flyte will catch the error, but the AWS Batch job status goes into a SUCCEEDED state and the flyte AWS Batch plugin reports catch the error b…
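A minimal reproduction sketch, assuming flytekit with the AWS Batch plugin installed (the task and workflow names are illustrative): a task that always raises, so the expected terminal state is FAILED rather than the reported SUCCEEDED.

```python
from flytekit import task, workflow
from flytekitplugins.awsbatch import AWSBatchConfig  # requires flytekitplugins-awsbatch

@task(task_config=AWSBatchConfig())
def always_fails() -> int:
    # The workflow should end FAILED; per the report above, the AWS Batch
    # job instead shows SUCCEEDED.
    raise RuntimeError("deliberate failure to exercise error propagation")

@workflow
def wf() -> int:
    return always_fails()
```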
-
### System Info
Hi Team,
First of all, huge thanks for all the great work you are doing.
Recently, I was benchmarking inference for the T5 model on AWS EC2 (a G6e machine with an L40 GPU) for batch sizes…
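A minimal benchmarking sketch using Hugging Face `transformers` (the model size, prompts, and generation settings are stand-ins; the issue's actual harness is truncated above): generate for several batch sizes and compare total versus per-sequence time.

```python
import time
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small").cuda().eval()

prompts = ["translate English to German: The house is wonderful."] * 8

for bs in (1, 4, 8):
    batch = tokenizer(prompts[:bs], return_tensors="pt", padding=True).to("cuda")
    torch.cuda.synchronize()
    start = time.perf_counter()
    with torch.no_grad():
        model.generate(**batch, max_new_tokens=32)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    print(f"batch={bs}: {elapsed:.3f}s total, {elapsed / bs:.3f}s per sequence")
```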
-
### System Info
GPU: a10g
### Who can help?
@kaiyux
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [X] An officially supported task…