-
Thanks for your great work.
I've been testing your model on some image examples, but I found that a single image and a batched input (the same image duplicated) give different model outputs. Here's a cod…
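(The poster's script is truncated above. For context, a minimal sketch of that kind of single-vs-batched comparison, assuming a Hugging Face-style processor and `generate()` API; the model id, image path, and prompt are placeholders, not the original code.)

```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "YOUR_MODEL_ID"  # placeholder, not from this repo
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id).eval()

image = Image.open("example.jpg")          # placeholder image
prompt = "Describe the image."

with torch.no_grad():
    # Single-image call
    single = processor(images=image, text=prompt, return_tensors="pt")
    out_single = model.generate(**single, max_new_tokens=64)

    # Same image duplicated into a batch of two
    batched = processor(images=[image, image], text=[prompt, prompt],
                        return_tensors="pt", padding=True)
    out_batch = model.generate(**batched, max_new_tokens=64)

# If batching is handled correctly, both rows of the batch should match the single output
print(processor.batch_decode(out_single, skip_special_tokens=True))
print(processor.batch_decode(out_batch, skip_special_tokens=True))
```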
-
I'm working on integrating this model into a project and need to know if it supports batch inference. If it does, could you provide some guidance or examples on how to implement it? If not, are there …
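(For reference, a minimal sketch of what batch inference usually looks like with a Hugging Face causal-LM interface: pad several prompts to the same length and call `generate()` once. The model id is a placeholder; whether this model supports it is exactly the open question.)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "YOUR_MODEL_ID"  # placeholder
# Left padding is the usual choice for decoder-only generation
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="left")
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

prompts = ["Question one?", "A much longer second question to show padding?"]
inputs = tokenizer(prompts, return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```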
-
Hi -- has anyone had success with batch inference for Qwen-VL? Other related issues in this repo didn't end up working for me (e.g. https://github.com/QwenLM/Qwen-VL/issues/51), nor did the documen…
-
Hi. Raising this issue as I am experiencing much slower inference times with Gemma-1 models.
> Environment:
> - xformers 0.0.26.post1 pypi_0 pypi
> - unsloth …
-
Hi authors,
I want to test the performance of Mistral7B on the test dataset. Is it only possible to do single-sample inference (with model.generate(...))? Are there any methods to accelerate t…
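(One common way to speed up evaluation over a whole test set is offline batched generation with vLLM rather than looping over model.generate(). A hedged sketch below; the checkpoint name and prompts are placeholders, not the authors' setup.)

```python
from vllm import LLM, SamplingParams

# Placeholder checkpoint; swap in whichever Mistral-7B weights are being evaluated
llm = LLM(model="mistralai/Mistral-7B-v0.1")
params = SamplingParams(temperature=0.0, max_tokens=128)

test_prompts = ["prompt 1", "prompt 2", "prompt 3"]  # your test set here

# vLLM batches and schedules all prompts internally in a single call
outputs = llm.generate(test_prompts, params)
for out in outputs:
    print(out.outputs[0].text)
```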
-
How do we figure out scaling requests when using Cog? As I understand it, since the models use a GPU and only one process can use a GPU at a time, how do we scale to hundreds of requests per second?
Is there any way …
-
Thanks for your excellent work.
I see that an ONNX model (for example, a ViT converted to ONNX) has a lot of potential if it can run inference on batched inputs, since that reduces time and boosts performance.
N…
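(For context, a small sketch of batched inference with ONNX Runtime, assuming the ViT was exported with a dynamic batch dimension; the file name, input shape, and provider below are placeholders for whatever the exported graph actually uses.)

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("vit.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Batch of 8 preprocessed images, NCHW layout
batch = np.random.rand(8, 3, 224, 224).astype(np.float32)

outputs = session.run(None, {input_name: batch})
print(outputs[0].shape)  # e.g. (8, num_classes) when the batch axis is dynamic
```

If the exported graph has a fixed batch size of 1, it usually needs to be re-exported with the batch axis marked dynamic (e.g. via `dynamic_axes` in `torch.onnx.export`) before this works.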
-
### Question
Hi, I've been extracting information from images using the inference examples for some time now. Could you tell me how to do batch inference, and share some example code?
Thank you.
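(A hedged example of one common pattern: process the images in fixed-size chunks and run each chunk through the model in one forward pass. This assumes a Hugging Face image-classification-style API; the model id and file paths are placeholders, not from this repo.)

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

model_id = "YOUR_MODEL_ID"  # placeholder
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id).eval()

paths = ["img_0.jpg", "img_1.jpg", "img_2.jpg", "img_3.jpg"]  # placeholders
batch_size = 2

for i in range(0, len(paths), batch_size):
    images = [Image.open(p).convert("RGB") for p in paths[i:i + batch_size]]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    print(logits.argmax(dim=-1).tolist())
```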
-
Great job! Are there any tips on setting up bounding boxes to perform batch inference?
-
### System Info
xinference v0.13.2
The vLLM backend there does not support batching inference; using OpenAI-style batched prompts just returns a 500 error.
Why not refer to
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/serving_…
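(For context, a sketch of the kind of batched-prompt request that triggers the 500 error, assuming an OpenAI-compatible endpoint served by xinference; the base URL, port, and model name are placeholders.)

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:9997/v1", api_key="not-needed")

# The legacy completions endpoint accepts a list of prompts as one batched request
resp = client.completions.create(
    model="my-model",  # placeholder
    prompt=["First prompt", "Second prompt"],
    max_tokens=32,
)
for choice in resp.choices:
    print(choice.text)
```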