-
There are no details about the output?
```
❯ bacalhau job describe j-acb8621a
ID = j-acb8621a-99f7-408e-9e00-ed03592b7dcf
Name = Run Over Share
Namespace = science
Type…
-
## Issue
Batching does not improve performance with DALI.
## Description
In summary, inference slows as we increase batching in our application.
We have an application that sends data to…
-
/kind feature
**Describe the solution you'd like**
To autoscale LLM inference services, Knative's request-level metrics may not be the best scaling metrics, as LLM inference is performed at the toke…
-
Hi, your work is excellent! I would like to ask where I can find your `requirements.txt` file because I can't seem to locate it. I want to know the version of the transformer package. Thank you!
``…
-
**Is your feature request related to a problem? Please describe.**
I'm serving a model that supports batching (`max_batch_size` > 0) and I would like to use config autocomplete, but I don't want to u…
-
So, I now have 4 solid test scenarios thanks to everyone's help here. They have all been tested in CPU mode. I am now switching to NVIDIA, and the Docker container doesn't seem to build.
I will be t…
-
Hi there!
I am trying to understand the Attention OCR repo and its inference.
I have seen its input/output details -
![image](https://user-images.githubusercontent.com/31642462/135845091-faaede6e-3d90…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
When I run `pytest -q -s tests/models/test_bert.py`, the reshaping of qkv fails, with the number of target cells being 3x those in the original tensor.
I've installed the base module as well as tho…
-
I strictly followed the instructions at https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int4 for deployment, but I always get the following error:
Traceback (most recent call last):
File "//abc.py", line 1, in <module>
from transformers import Qwen2VLForConditionalGen…