-
So faster-whisper is built on CTranslate2, and checking the CTranslate2 GitHub, they say:
> "Multiple CPU architectures support
The project supports x86-64 and AArch64/ARM64 processors and int…
-
# Bug description
I have 40 videos of single-animal inference. I want to get predictions using a model I trained myself. It worked for 2 months, and then all of a sudden I got the following …
-
### Describe the issue
We are seeing an issue with a Transformer model which was exported using torch.onnx.export and then optimized with optimum ORTOptimizer. Inferencing seems to not be using GPU a…
-
### System Info
```Shell
- `Accelerate` version: 0.35.0.dev0
- Platform: Linux-5.15.0-121-generic-x86_64-with-glibc2.35
- `accelerate` bash location: redacted
- Python version: 3.10.14
- Numpy…
-
### Your current environment
```text
Collecting environment information...
WARNING 10-29 12:20:54 _custom_ops.py:19] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C…
-
If I try to call `Libdl.dlpath` in LinearAlgebra's `__init__`, the Julia build fails. Calling `Libdl.dlext` works fine though.
```
Sysimage built. Summary:
Total ─────── 57.897861 seconds
Base…
-
### 🐛 Describe the bug
I wasn't sure if this belonged here, but since the error message says to create a bug report for PyTorch, I will.
Traceback
```
Traceback (most recent call last):
File…
-
# Latency in OpenVINO Model Server Inside Kubernetes Cluster
## To Reproduce
**Steps to reproduce the behavior:**
1. **Prepare Models Repository**: Followed standard procedures to set up the mode…
-
### 🐛 Describe the bug
TorchServe version is 0.10.0.
It's my code:
```
def get_inference_stub(address: str, port: Union[str, int]= 7070):
channel = grpc.insecure_channel(address + ':' + str(p…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### Reproduction
Hi there, I am observing a difference in output between LLaMA-Factory inference and llama.cpp.
I am…