-
How to easily take advantage of batch processing during inference?
-
What kind of changes do we need to make to enable batch inference? Currently the boxes have the shape
(num_detections, 4) instead of (None, num_detections, 4)
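The shape change the question describes — from a per-image `(num_detections, 4)` box array to a batched `(None, num_detections, 4)` one — can be sketched with plain NumPy: pad each image's boxes to a common length and stack them, keeping a mask for the real rows. The function name and mask convention here are illustrative, not from any particular framework.

```python
import numpy as np

def pad_and_stack_boxes(per_image_boxes):
    """Stack per-image (num_detections, 4) box arrays into one
    (batch, max_detections, 4) array, zero-padding shorter entries.
    A boolean mask marks which rows are real detections."""
    batch = len(per_image_boxes)
    max_det = max(b.shape[0] for b in per_image_boxes)
    boxes = np.zeros((batch, max_det, 4), dtype=np.float32)
    mask = np.zeros((batch, max_det), dtype=bool)
    for i, b in enumerate(per_image_boxes):
        boxes[i, : b.shape[0]] = b
        mask[i, : b.shape[0]] = True
    return boxes, mask

# two images with 3 and 1 detections respectively
boxes, mask = pad_and_stack_boxes([np.ones((3, 4)), np.ones((1, 4))])
print(boxes.shape)  # (2, 3, 4)
```

Padding plus a mask is the usual workaround when images in a batch produce different detection counts and the model's output tensor needs a fixed leading batch dimension.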
-
I tested the batch inference results of the llava and llava-next-video models using tensorrt-llm based on the examples/multimodal/run.py file. The parameters for their generate method are the same, as…

-
Batch inference is slower than single-image inference on a GPU 3080. Java version is 1.8.0_181, ONNX Runtime version is 1.7.0.
For example, running inference on 10 pictures one by one costs 2.87 seconds in total,
but…
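In principle, batching amortizes per-call overhead by turning many small operations into one large one; whether that wins in practice depends on the runtime, padding, and session configuration. A minimal NumPy sketch of the two styles (not ONNX Runtime itself, just the underlying idea) would be:

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.standard_normal((10, 256))   # 10 "pictures" as flat feature vectors
weights = rng.standard_normal((256, 128))  # a stand-in for one model layer

# one-by-one: 10 separate matrix-vector products
seq = np.stack([img @ weights for img in images])

# batched: a single (10, 256) @ (256, 128) matrix product
bat = images @ weights

# both paths compute the same result; the batched call does it in one kernel
assert np.allclose(seq, bat)
```

If a real runtime shows the batched path slower, as reported here, the cause is usually outside this core math: extra copies, padding to the largest input, or a session not tuned for the larger shapes.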
-
Hello,
I've been using the model for some experiments with code based on `demo/ctw1500_detection.py`. I would like to move away from `VisualizationDemo` because it does not support batch inference. Lo…
-
Hi,
I need to perform batch inference for my use-case. I followed this thread [here](https://github.com/facebookresearch/detectron2/issues/282#issuecomment-562386367) that extends the `DefaultPredi…
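Extending a single-input predictor to batches, as the linked detectron2 thread does, generally amounts to chunking the inputs and concatenating the outputs. A framework-agnostic sketch (the `model` interface below is hypothetical: any callable that maps a list of inputs to a list of per-input outputs):

```python
from typing import Callable, List, Sequence

def predict_in_batches(model: Callable[[List], List],
                       inputs: Sequence,
                       batch_size: int = 8) -> List:
    """Run `model` over `inputs` in fixed-size chunks and concatenate
    the per-input outputs in order."""
    outputs: List = []
    for start in range(0, len(inputs), batch_size):
        chunk = list(inputs[start:start + batch_size])
        outputs.extend(model(chunk))
    return outputs

# toy model that squares each input, run over 10 inputs in chunks of 4
result = predict_in_batches(lambda xs: [x * x for x in xs], range(10), batch_size=4)
print(result)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

The same loop shape is what the extended-`DefaultPredictor` recipes in that thread implement, with the chunk forwarded to the underlying model in a single call.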
-
Thanks for this amazing example of running TensorRT with the Python API instead of the C++ API; it is really helpful.
Would it be possible to provide an example of how to do batch inference, if this…
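Independent of the engine itself, the host-side part of batch inference is assembling inputs into one contiguous batched buffer in the layout the engine expects. A hedged NumPy-only sketch (the function name and the NCHW/float32 convention are assumptions; the actual layout depends on how the engine was built):

```python
import numpy as np

def make_nchw_batch(images):
    """Assemble same-sized HWC uint8 images into a contiguous float32
    NCHW batch, the layout an explicit-batch engine commonly expects.
    (Resizing to a common size is assumed to be done beforehand.)"""
    batch = np.stack([img.astype(np.float32) / 255.0 for img in images])  # NHWC
    batch = np.transpose(batch, (0, 3, 1, 2))  # NHWC -> NCHW
    return np.ascontiguousarray(batch)

imgs = [np.zeros((224, 224, 3), dtype=np.uint8) for _ in range(4)]
batch = make_nchw_batch(imgs)
print(batch.shape)  # (4, 3, 224, 224)
```

The contiguity matters because the buffer is typically copied to the device as a single flat allocation before execution.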
-
Hello mlcommons team,
I want to run the "Automated command to run the benchmark via MLCommons CM" (from the example: https://github.com/mlcommons/inference/tree/master/language/llama2-70b) with a d…
-
Does bolt support batch inference? Can I run inference on two or more sentences at the same time?
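Whatever the library, running two or more sentences at once usually means padding the token-id sequences to a common length and tracking the padding with a mask. A generic sketch (names and `pad_id=0` are illustrative, not bolt's API):

```python
import numpy as np

def pad_batch(token_id_seqs, pad_id=0):
    """Pad variable-length token-id sequences to a rectangular
    (batch, max_len) int array plus an attention mask, so several
    sentences can go through a model in one forward pass."""
    max_len = max(len(s) for s in token_id_seqs)
    ids = np.full((len(token_id_seqs), max_len), pad_id, dtype=np.int64)
    mask = np.zeros_like(ids)
    for i, s in enumerate(token_id_seqs):
        ids[i, : len(s)] = s
        mask[i, : len(s)] = 1
    return ids, mask

ids, mask = pad_batch([[5, 6, 7], [8, 9]])
print(ids.tolist())   # [[5, 6, 7], [8, 9, 0]]
print(mask.tolist())  # [[1, 1, 1], [1, 1, 0]]
```

The mask lets the model ignore the padded positions, so the two sentences produce the same results they would individually.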
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue y…