-
(mlperf) susie.sun@yizhu-R5300-G5:~$ cmr "run mlperf inference generate-run-cmds _submission" --quiet --submitter="MLCommons" --hw_name=default --model=resnet50 --implementation=reference --backend=tf…
-
### System Info
I'm using the current docker image `ghcr.io/huggingface/text-embeddings-inference:turing-1.5` on Debian 11 with CUDA driver 12.2 and an Nvidia T4 GPU.
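For reference, the container is launched roughly like this (a minimal sketch; the model id, port mapping, and volume path are placeholders, not my exact values):
```
# Illustrative launch of the Turing image on the T4 host;
# model id, host port, and volume path are placeholders.
model=BAAI/bge-large-en-v1.5
volume=$PWD/data

docker run --gpus all -p 8080:80 -v $volume:/data \
    ghcr.io/huggingface/text-embeddings-inference:turing-1.5 \
    --model-id $model
```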
### Information
- [X] Docker
-…
-
**Description**
I am currently using the Triton vLLM backend in my Kubernetes cluster. There are two GPUs that Triton is able to see; however, it seems to choose only GPU 0 to load the model weights.
I h…
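For context, assuming the upstream triton-inference-server/vllm_backend layout, the engine is configured through a `model.json` in the model directory; a minimal sketch of the kind of settings involved, with illustrative values that are not taken from my deployment:
```
{
  "model": "meta-llama/Llama-2-7b-chat-hf",
  "gpu_memory_utilization": 0.9,
  "tensor_parallel_size": 2
}
```
If `tensor_parallel_size` is left at its default of 1, vLLM loads the weights onto a single device, which would match the behavior described above.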
-
## Goal
- Jan supports most llama.cpp params
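
As a rough illustration of the kind of parameters meant here (llama.cpp flag names as of current upstream; the binary name, model path, and values are placeholders):
```
# Illustrative subset of llama.cpp parameters; model path and values are placeholders.
./llama-server -m ./models/model.gguf \
    --ctx-size 4096 \
    --n-gpu-layers 33 \
    --temp 0.7 --top-k 40 --top-p 0.9
```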
## Tasklist
**Cortex**
- [x] https://github.com/janhq/cortex.cpp/issues/1151
**Jan**
- [ ] Update Right Sidebar UX for Jan
- [ ] Enable Jan's API serv…
-
Hi everyone,
I'm a newbie here and looking for your help.
I have a public pre-trained model from a GPU server (download here https://drive.google.com/drive/folders/0BzY0S4QyX701OFJfbkZ3NmhTb1E). I…
-
Unable to run performance analyzer on my model
I am using a SageMaker wrapper image of Triton Server and am able to serve the model, send it requests, and even validate that it is up; all ports for grpc, …
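A typical invocation against the running endpoint looks roughly like this (a sketch; the model name, URL, and concurrency range are placeholders, not my exact values):
```
# Illustrative perf_analyzer run over gRPC against a Triton endpoint;
# model name, URL, and concurrency range are placeholders.
perf_analyzer -m my_model \
    -i grpc -u localhost:8001 \
    --concurrency-range 1:4
```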
-
# Bug description
```
SLEAP: 1.3.4
TensorFlow: 2.7.0
Numpy: 1.19.5
Python: 3.7.12
OS: Linux-5.15.0-122-generic-x86_64-with-debian-bookworm-sid
GPUs: 1/1 available
Device: /physical_device:GPU:0
…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits of both this extension and the webui
### Have you updated WebUI and this exte…
-
**Description**
Optional parameters don't seem to work for the pytorch backend. The example below returns ```UNAVAILABLE: Invalid argument: 'optional' is set to true for input 'input' while the backe…
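For illustration, this is the kind of `config.pbtxt` input declaration that produces the error (a sketch with placeholder names, types, and dims, not my actual model config):
```
# Illustrative model config with an optional input for the PyTorch backend;
# the tensor name, data type, and dims are placeholders.
backend: "pytorch"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ -1 ]
    optional: true
  }
]
```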
-
I'm following [this README](https://github.com/mlcommons/inference_results_v3.0/tree/main/closed/Intel/code/resnet50/pytorch-cpu) to run R50 inference on an Intel Sapphire Rapids 8-core cloud instance…