-
(mlperf) susie.sun@yizhu-R5300-G5:~$ cmr "run mlperf inference generate-run-cmds _submission" --quiet --submitter="MLCommons" --hw_name=default --model=resnet50 --implementation=reference --backend=tf…
-
### System Info
I'm using the current docker image `ghcr.io/huggingface/text-embeddings-inference:turing-1.5` on Debian 11 with CUDA driver 12.2 and an Nvidia T4 GPU.
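For reference, the container is launched roughly like this (a minimal sketch; the model id, port mapping, and volume path are placeholders, not my exact values):
```
# Illustrative launch of the Turing image on the T4 host;
# model id, host port, and volume path are placeholders.
model=BAAI/bge-large-en-v1.5
volume=$PWD/data

docker run --gpus all -p 8080:80 -v $volume:/data \
    ghcr.io/huggingface/text-embeddings-inference:turing-1.5 \
    --model-id $model
```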
### Information
- [X] Docker
-…
-
**Description**
I am currently using the Triton vLLM backend in my Kubernetes cluster. There are two GPUs that Triton is able to see; however, it seems to choose only GPU 0 to load the model weights.
I h…
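For context, assuming the upstream triton-inference-server/vllm_backend layout, the engine is configured through a `model.json` in the model directory; a minimal sketch of the kind of settings involved, with illustrative values that are not taken from my deployment:
```
{
  "model": "meta-llama/Llama-2-7b-chat-hf",
  "gpu_memory_utilization": 0.9,
  "tensor_parallel_size": 2
}
```
If `tensor_parallel_size` is left at its default of 1, vLLM loads the weights onto a single device, which would match the behavior described above.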
-
## Goal
- Jan supports most llama.cpp params
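
As a rough illustration of the kind of parameters meant here (llama.cpp flag names as of current upstream; the binary name, model path, and values are placeholders):
```
# Illustrative subset of llama.cpp parameters; model path and values are placeholders.
./llama-server -m ./models/model.gguf \
    --ctx-size 4096 \
    --n-gpu-layers 33 \
    --temp 0.7 --top-k 40 --top-p 0.9
```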
## Tasklist
**Cortex**
- [x] https://github.com/janhq/cortex.cpp/issues/1151
**Jan**
- [ ] Update Right Sidebar UX for Jan
- [ ] Enable Jan's API serv…
-
Hi everyone,
I'm a newbie here and looking for your help.
I have a public pre-trained model from a GPU server (download here https://drive.google.com/drive/folders/0BzY0S4QyX701OFJfbkZ3NmhTb1E). I…
-
Unable to run performance analyzer on my model
I am using a SageMaker wrapper image of Triton Server and am able to serve the model, send it requests, and even validate that it is up; all ports for grpc, …
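A typical invocation against the running endpoint looks roughly like this (a sketch; the model name, URL, and concurrency range are placeholders, not my exact values):
```
# Illustrative perf_analyzer run over gRPC against a Triton endpoint;
# model name, URL, and concurrency range are placeholders.
perf_analyzer -m my_model \
    -i grpc -u localhost:8001 \
    --concurrency-range 1:4
```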
-
# Bug description
```
SLEAP: 1.3.4
TensorFlow: 2.7.0
Numpy: 1.19.5
Python: 3.7.12
OS: Linux-5.15.0-122-generic-x86_64-with-debian-bookworm-sid
GPUs: 1/1 available
Device: /physical_device:GPU:0
…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits of both this extension and the webui
### Have you updated WebUI and this exte…
-
**Description**
Optional parameters don't seem to work for the pytorch backend. The example below returns ```UNAVAILABLE: Invalid argument: 'optional' is set to true for input 'input' while the backe…
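For illustration, this is the kind of `config.pbtxt` input declaration that produces the error (a sketch with placeholder names, types, and dims, not my actual model config):
```
# Illustrative model config with an optional input for the PyTorch backend;
# the tensor name, data type, and dims are placeholders.
backend: "pytorch"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ -1 ]
    optional: true
  }
]
```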
-
I'm following [this README](https://github.com/mlcommons/inference_results_v3.0/tree/main/closed/Intel/code/resnet50/pytorch-cpu) to run R50 inference on an Intel Sapphire Rapids 8-core cloud instance…