-
## Problem description
Embedding/rerank models started from the UI have no concurrency-related settings.
Sending requests from the client with asyncio or concurrent.futures is actually slower than a synchronous for loop.
**How can the models be made to run inference concurrently?**
## Models launched on the xinference side
embedding:
rerank:
## Test results
### embedding endpoint test…
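Without the full test code it is hard to be sure, but one common cause of "async is slower than a for loop" is a client that never actually issues requests concurrently. Below is a minimal sketch of genuinely concurrent requests against xinference's OpenAI-compatible `/v1/embeddings` endpoint; the URL, model name, and concurrency value are placeholders, not values from the report:

```python
import asyncio

import httpx

BASE_URL = "http://127.0.0.1:9997/v1/embeddings"  # placeholder xinference endpoint
MODEL = "bge-large-zh"                            # placeholder model name
CONCURRENCY = 8

async def embed(client: httpx.AsyncClient, sem: asyncio.Semaphore, text: str):
    # The semaphore caps in-flight requests so the server isn't flooded.
    async with sem:
        resp = await client.post(BASE_URL, json={"model": MODEL, "input": text})
        resp.raise_for_status()
        return resp.json()["data"][0]["embedding"]

async def main(texts):
    sem = asyncio.Semaphore(CONCURRENCY)
    async with httpx.AsyncClient(timeout=60) as client:
        # gather() launches every coroutine before awaiting any of them;
        # awaiting each request inside a plain loop would serialize them again.
        return await asyncio.gather(*(embed(client, sem, t) for t in texts))

if __name__ == "__main__":
    vectors = asyncio.run(main([f"sentence {i}" for i in range(64)]))
    print(len(vectors), "embeddings")
```

Even with a concurrent client, the server only processes requests in parallel if the model itself was launched with enough capacity (e.g. multiple replicas, where the xinference version exposes that option), which is exactly the setting the question is asking about.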
-
Do you have code for batch processing images? I want to use my own dataset for batch inference. Looking forward to your reply.
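The repository in question isn't clear from the excerpt, so as a generic sketch only: the usual pattern for batch inference over your own image dataset is a `DataLoader` feeding batches to one forward pass each. The dataset path and the ResNet-18 stand-in model are illustrative assumptions:

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.models import ResNet18_Weights, resnet18

# "path/to/your/dataset" is a placeholder for an ImageFolder-style directory.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder("path/to/your/dataset", transform=preprocess)
loader = DataLoader(dataset, batch_size=32, num_workers=4)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = resnet18(weights=ResNet18_Weights.DEFAULT).eval().to(device)

predictions = []
with torch.no_grad():
    for images, _ in loader:
        # One forward pass per batch of 32 instead of 32 separate calls.
        logits = model(images.to(device))
        predictions.append(logits.argmax(dim=1).cpu())
predictions = torch.cat(predictions)
```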
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
Batch inference doesn't seem to be working. Would you mind providing an example of batch inference for model.predict? It seems to work only with a batch size of 1.
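For reference, one way that reliably runs a true batch through Ultralytics is to pass a pre-batched tensor as the source (a list of sources also works, though how it is batched internally can vary by version). A sketch, with the weights file and image paths as placeholders:

```python
import torch
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # placeholder weights

# Option 1: a list of sources; one Results object is returned per image.
results = model.predict(["img1.jpg", "img2.jpg", "img3.jpg"])  # placeholder paths

# Option 2: a pre-batched float tensor in (N, 3, H, W) with values in [0, 1],
# which goes through the network as a single forward pass.
batch = torch.rand(4, 3, 640, 640)
results = model.predict(batch)

for r in results:
    print(r.boxes.xyxy.shape)  # detected boxes for each image in the batch
```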
-
Hi,
We have successfully created a batch version of the model using ONNX and TRT. We are trying this on an A10 GPU, and here is what we have observed: for a batch of 16 we get 96 ms inference time, and if w…
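When weighing numbers like these, the useful metric is per-image latency across batch sizes rather than per-call latency: by that measure, 96 ms for a batch of 16 is 6 ms per image. A rough ONNX Runtime timing sketch, assuming the export has a dynamic batch dimension; the model path, input shape, and provider list are placeholders:

```python
import time

import numpy as np
import onnxruntime as ort

# Placeholders: adjust to your exported model.
sess = ort.InferenceSession(
    "model.onnx", providers=["CUDAExecutionProvider", "CPUExecutionProvider"]
)
input_name = sess.get_inputs()[0].name

for batch in (1, 4, 16):
    x = np.random.rand(batch, 3, 640, 640).astype(np.float32)
    sess.run(None, {input_name: x})  # warm-up run, excluded from timing
    t0 = time.perf_counter()
    for _ in range(20):
        sess.run(None, {input_name: x})
    ms = (time.perf_counter() - t0) / 20 * 1000
    print(f"batch={batch}: {ms:.1f} ms per call, {ms / batch:.2f} ms per image")
```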
-
- CPU architecture: x86_64
- GPU: NVIDIA H100
- Libraries
  - TensorRT-LLM: v0.11.0
  - TensorRT: 10.1.0
  - Modelopt: 0.13.1
  - CUDA: 12.3
- NVIDIA driver version: 535.129.03
Hello, I'm e…
-
### Description
The current implementation of the Inference API is to send each request individually as it is received. There are adjustable limits on how many requests can be sent concurrently.…
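The request, presumably, is to coalesce those individual sends into batches. To make the idea concrete, here is a hypothetical client-side micro-batcher; none of these names come from any existing API, and `send_batch` is a stand-in for the real batched call:

```python
import asyncio

MAX_BATCH = 16
MAX_WAIT_S = 0.01  # flush a partial batch after 10 ms

async def send_batch(items):
    # Stand-in for a real batched API call.
    await asyncio.sleep(0.05)
    return [f"result:{item}" for item in items]

async def batcher(queue: asyncio.Queue):
    loop = asyncio.get_running_loop()
    while True:
        batch = [await queue.get()]
        deadline = loop.time() + MAX_WAIT_S
        # Keep collecting until the batch is full or the time window closes.
        while len(batch) < MAX_BATCH:
            timeout = deadline - loop.time()
            if timeout <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(queue.get(), timeout))
            except asyncio.TimeoutError:
                break
        results = await send_batch([item for item, _ in batch])
        for (_, fut), result in zip(batch, results):
            fut.set_result(result)

async def submit(queue, item):
    # Each caller gets a future resolved when its batch comes back.
    fut = asyncio.get_running_loop().create_future()
    await queue.put((item, fut))
    return await fut

async def main():
    queue = asyncio.Queue()
    worker = asyncio.create_task(batcher(queue))
    results = await asyncio.gather(*(submit(queue, i) for i in range(40)))
    worker.cancel()
    print(len(results), "results")

asyncio.run(main())
```

The two knobs, batch size and maximum wait, trade throughput against the latency added to the first request in each batch.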
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### Ultralytics YOLO Component
…
-
So this is a strange one. I am stumped.
In a way, this is sort of like #416, but I confirmed that if Batch==1, the problem does not occur (see below).
My inference loop looks like this:
```
…
```
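The loop itself is cut off above, so purely as a hypothetical stand-in, here is a generic batched loop of the kind such reports usually involve (`model.onnx`, the input shape, and the output layout are all assumptions). The classic failure mode when only Batch==1 works is post-processing that squeezes or hard-codes the leading batch axis instead of indexing it:

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx")  # placeholder model
input_name = sess.get_inputs()[0].name

images = np.random.rand(32, 3, 640, 640).astype(np.float32)
BATCH = 8

for start in range(0, len(images), BATCH):
    chunk = images[start:start + BATCH]
    outputs = sess.run(None, {input_name: chunk})[0]
    # Index each sample explicitly: calling .squeeze() here is what
    # silently works at batch 1 and breaks at batch > 1.
    for i in range(outputs.shape[0]):
        per_image = outputs[i]
```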
-
Is it possible to run inference in batches instead of one by one?
If so, please suggest an approach.
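Generally yes: regardless of framework, the pattern is to group inputs into fixed-size chunks and run one forward pass per chunk. A minimal PyTorch sketch with a stand-in linear model in place of the real one:

```python
import torch

def batched(items, size):
    """Yield successive fixed-size chunks of a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

model = torch.nn.Linear(128, 10)  # stand-in for your real model
model.eval()

inputs = [torch.rand(128) for _ in range(1000)]
outputs = []
with torch.no_grad():
    for chunk in batched(inputs, 64):
        x = torch.stack(chunk)    # (64, 128): one batch, one forward pass
        outputs.extend(model(x))  # instead of 64 separate calls
```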