batch-inference Search Results

1000+ results
for batch-inference


bacalhau-project/bacalhau #4247

Where did this job save its stuff?

There's no details about the output? ``` ❯ bacalhau job describe j-acb8621a ID = j-acb8621a-99f7-408e-9e00-ed03592b7dcf Name = Run Over Share Namespace = science Type…

aronchick updated 3 weeks ago
2
triton-inference-server/dali_backend #178

Batching does not improve performance with dali

## Issue Batching does not improve performance with dali. ## Description In summary, inference slows as we increase batching in our application. We have an application that sends data to…

hly0025 updated 11 months ago
10
kserve/kserve #3561

Native integration with KEDA for LLM inference autoscaling

/kind feature **Describe the solution you'd like** To autoscale LLM inference services Knative's request level metrics may not be the best scaling metrics as LLM inference is performed at the toke…

yuzisun updated 4 months ago
4
gecgomes/ICD_Coding_MSAM #1

The version of the transformer package

Hi, your work is excellent! I would like to ask where I can find your `requirements.txt` file because I can't seem to locate it. I want to know the version of the transformer package. Thank you! ``…

LUOyk1999 updated 1 month ago
4
triton-inference-server/onnxruntime_backend #147

Allow dynamic batch scheduler to be disabled when autocomple…

**Is your feature request related to a problem? Please describe.** I'm serving a model that supports batching (`max_batch_size` > 0) and I would like to use config autocomplete, but I don't want to u…

OvervCW updated 1 year ago
6
mlcommons/inference #1830

nvidia build issues

So, I now have 4 solid test scenarios thanks to everyone's help here. They have all been tested in cpu mode. I am now switching to nvidia and the docker container doesn't seem to build. I will be t…

howudodat updated 2 weeks ago
19
tensorflow/models #10293

Inference for AttentionOCR model

Hi there! I am trying to understand Attention OCR repo and its inference. I have seen its input/output details - ![image](https://user-images.githubusercontent.com/31642462/135845091-faaede6e-3d90…

neso613 updated 1 year ago
11
ultralytics/ultralytics #15837

It is hoped that the exported model can infer two pictures a…

### Search before asking - [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…

liangjiegao updated 1 week ago
1
Dao-AILab/flash-attention #526

Tests failing for BERT

When I run `pytest -q -s tests/models/test_bert.py`, the reshaping of qkv fails, with the number of target cells being 3x those in the original tensor. I've installed the base module as well as tho…

kevinhu updated 1 year ago
1
QwenLM/Qwen2-VL #139

Question about deploying Qwen2-VL-7B-Instruct-GPTQ-Int4

I strictly followed the instructions at https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int4 for deployment, but the result is always the following error: Traceback (most recent call last): File "//abc.py", line 1, in from transformers import Qwen2VLForConditionalGen…

gy850222 updated 1 day ago
1
