-
### Search before asking
- [X] I have searched the Inference [issues](https://github.com/roboflow/inference/issues) and found no similar bug report.
### Bug
## Set Up
I use a Basler Camera acA1…
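(The rest of the setup is truncated, but a typical Basler capture path uses pypylon; a minimal sketch, assuming the first attached device is the camera in question — the device selection and single-frame grab here are illustrative, not the reporter's code:)
```python
# Minimal pypylon capture sketch (assumed setup; the reporter's exact
# camera model and pipeline are cut off above).
from pylon import pylon  # pip package: pypylon
from pypylon import pylon

# Attach to the first Basler device found on the USB/GigE bus.
camera = pylon.InstantCamera(pylon.TlFactory.GetInstance().CreateFirstDevice())
camera.Open()
camera.StartGrabbing(pylon.GrabStrategy_LatestImageOnly)
try:
    grab = camera.RetrieveResult(5000, pylon.TimeoutHandling_ThrowException)
    if grab.GrabSucceeded():
        frame = grab.Array  # numpy array, ready to hand to an inference call
    grab.Release()
finally:
    camera.StopGrabbing()
    camera.Close()
```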
-
Hi,
I've been trying to serve different Phi3 models using the llama.cpp server created by ipex-llm's `init-llama-cpp`.
When I serve with this version I have two problems:
1) The server doesn…
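(For context, a llama.cpp server exposes an OpenAI-compatible endpoint; a minimal smoke test, assuming the server is already listening on localhost:8080 — the port and prompt are assumptions:)
```python
# Smoke-test a running llama.cpp server via its OpenAI-compatible API.
# Host, port, and prompt are assumptions; adjust to your setup.
import json
import urllib.request

payload = {
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 64,
}
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])
```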
-
### Search before asking
- [X] I have searched the HUB [issues](https://github.com/ultralytics/hub/issues) and found no similar bug report.
### HUB Component
Inference
### Bug
3 issues listed be…
-
**Describe the package you'd like added**
`vllm` has become a popular inference server for LLMs: https://github.com/vllm-project/vllm
**Describe how this package fits in with the project**
GenAI/…
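(For reference, vLLM's offline Python entry point is only a few lines; a minimal sketch — the model name is an arbitrary small example, not taken from the request above:)
```python
# Minimal vLLM offline-inference sketch; the model id is an example placeholder.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small model for a quick check
params = SamplingParams(temperature=0.8, max_tokens=32)
outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)
```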
-
When I use a Faster R-CNN TRT model with the inference server, no error is reported and it works well. But I found a strange phenomenon: when I try to send a series of pictures to the model at the same time, i…
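(The concurrent-send pattern described here is usually done with Triton's client-side async API; a hedged sketch, assuming an HTTP endpoint on localhost:8000 and a model named `faster_rcnn` with a single FP32 input named `input` — all of those names and shapes are assumptions:)
```python
# Send several images to a Triton model concurrently via async_infer.
# URL, model name, input name, and shape are assumptions for illustration.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000", concurrency=4)

pending = []
for _ in range(4):
    img = np.random.rand(1, 3, 800, 800).astype(np.float32)
    inp = httpclient.InferInput("input", list(img.shape), "FP32")
    inp.set_data_from_numpy(img)
    pending.append(client.async_infer("faster_rcnn", inputs=[inp]))

# Collect results; each get_result() blocks until that request completes.
results = [p.get_result() for p in pending]
```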
-
volodya
High
# Forecast-implied inferences can be set to any value because ForecastElements is not filtered for duplicates
## Summary
forecast-implied inferences can be set to any value due to Foreca…
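(The fix this finding implies is a duplicate filter before the elements are combined; a generic first-wins dedup-by-key sketch — the `inferer` field name and the values are hypothetical:)
```python
# Generic first-wins dedup by key; the `inferer` key field is hypothetical.
def filter_duplicates(elements):
    seen = set()
    unique = []
    for elem in elements:
        if elem["inferer"] not in seen:
            seen.add(elem["inferer"])
            unique.append(elem)
    return unique

elements = [
    {"inferer": "alice", "value": 1.0},
    {"inferer": "alice", "value": 9999.0},  # a duplicate would skew the result
    {"inferer": "bob", "value": 2.0},
]
assert filter_duplicates(elements) == [elements[0], elements[2]]
```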
-
- Environment
  - docker: registry.cn-hangzhou.aliyuncs.com/havenask/rtp_llm:0.1.13_cuda12
  - cuda: 12.1
  - driver: 515.105.01
- Model:
  - llama: https://huggingface.co/lmsys/vicuna-33b-v1.3
…
-
There are two definitions of `gen_random_start_ids` in tools/utils/utils.py:
https://github.com/triton-inference-server/tensorrtllm_backend/blob/ae52bce3ed8ecea468a16483e0dacd3d156ae4fe/tools/utils/utils.py#L238-L…
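(The effect of that duplication is silent: in Python a second `def` simply rebinds the name, so the earlier implementation becomes dead code. A self-contained illustration:)
```python
# In Python, a second `def` with the same name silently shadows the first.
def gen_random_start_ids():
    return "first definition"

def gen_random_start_ids():
    return "second definition"

# Only the later definition survives; the earlier one is unreachable.
print(gen_random_start_ids())  # -> "second definition"
```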
-
Hi there,
First, thank you for unsloth, it's great!
I've finetuned a llama-3-8b-Instruct-bnb-4bit model and pushed it to the HF Hub. When I try to deploy it using [hf Inference Endpoints](https://huggingfa…
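(A useful local sanity check before debugging the Endpoint is to load the pushed 4-bit checkpoint with transformers + bitsandbytes; a sketch — the repo id is a placeholder, not the reporter's actual model:)
```python
# Load a bnb-4bit checkpoint locally; the repo id is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

repo = "your-user/llama-3-8b-Instruct-bnb-4bit"  # placeholder repo id
quant = BitsAndBytesConfig(load_in_4bit=True)

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, quantization_config=quant, device_map="auto"
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```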
-
Tracking the second round of issues submitted to [triton-inference-server](https://github.com/triton-inference-server/server):
- [ ] https://github.com/triton-inference-server/server/issues/2018: Con…