-
**Describe the bug**
My CPU is an Ultra 7 258V, and my system is Windows 11 Home 24H2. I just tried running the qwen2.5-7b-instruct model using your example code for the first time. However, I noticed t…
-
### System Info
Hi Team,
When deploying the model on AWS with `huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0`, I got the above error.
Could you tell me when TGI can provide the new image? Is t…
-
Hi all,
I am urgently trying to deploy TFLite models converted with Larq Compute Engine (LCE) on an ARM32 device, specifically a Cortex-M7 CPU (an STM32F7-series MCU).
I have seen some rel…
-
### Describe the Bug
File "/data/mlops/Open-Assistant/inference/server/oasst_inference_server/plugins/vectors_db/loaders/data_loader.py", line 383, in path_to_doc1
res = file_to_doc(file, …
-
Chai-1 is limited to 2048 tokens (token=canonical AA or atom), and the main reason is high memory consumption.
We received several requests to support larger crop sizes, but it requires _significa…
-
Hey team,
First of all, thanks for the effort you are doing for this amazing project.
I would like to ask for support for a recent and very important addition to AWS Bedrock: cross-r…
-
### Self Checks
- [X] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find any relevant information that meets my needs. [English](https://speech…
-
If the compiler hits an inference error, it provides neither a line number nor any information in the error message about what caused the error.
E.g.
```
y = lambda Mou…
```
-
### System Info
Name: peft
Version: 0.13.2
### Who can help?
When I try to load the adapter for inference, it shows the following error:
`TypeError: LoraConfig.__init__() got an unexpected ke…
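This kind of `TypeError` typically appears when an `adapter_config.json` saved by a newer peft release contains fields the installed `LoraConfig` does not accept. A common workaround is to filter the config dict down to the keys the constructor recognizes before instantiating it. Below is a minimal, hedged sketch: `LoraConfig` here is a stand-in dataclass (not the real peft class), and `new_field_from_newer_peft` is a hypothetical unknown key used only for illustration.

```python
import inspect
from dataclasses import dataclass


@dataclass
class LoraConfig:
    # Stand-in for peft.LoraConfig; only two illustrative fields.
    r: int = 8
    lora_alpha: int = 16


def filter_kwargs(cls, cfg: dict) -> dict:
    """Keep only the keys that cls.__init__ actually accepts."""
    accepted = set(inspect.signature(cls.__init__).parameters)
    return {k: v for k, v in cfg.items() if k in accepted}


# Simulated adapter_config.json with an unknown key from a newer release.
raw_cfg = {"r": 4, "lora_alpha": 32, "new_field_from_newer_peft": True}
config = LoraConfig(**filter_kwargs(LoraConfig, raw_cfg))
```

With the real peft class, the same `filter_kwargs` helper can be applied to the loaded JSON dict; upgrading peft so both sides use the same version is usually the cleaner fix.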
-
**Description**
Error
```
model_instance_state.cc:1117] "Failed updating TRT LLM statistics: Internal - Failed to find Max KV cache blocks in metrics."
```
when the KV cache is disabled while building…