-
Hi NJU-Jet,
My Linux server has several 2.6 GHz CPUs and several V100 GPUs. I ran **generate_tflite.py** to get a quantized model,
and then in the **evaluate** function I added the code below to measu…
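The truncated snippet above is presumably timing the quantized model inside `evaluate`. As a hedged sketch of that kind of measurement (the `invoke` callable below is a stand-in for whatever the real loop calls, e.g. a TFLite interpreter's `invoke()`; the helper name is hypothetical), a warm-up phase followed by repeated timed runs avoids counting one-time setup cost:

```python
import time
import statistics

def time_inference(invoke, warmup=5, runs=50):
    """Time a zero-argument inference callable.

    `invoke` stands in for the real call (e.g. a TFLite
    interpreter's `invoke()`); warm-up iterations are excluded
    so one-time allocation cost doesn't skew the numbers.
    Returns (mean_ms, median_ms).
    """
    for _ in range(warmup):
        invoke()
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        invoke()
        samples.append((time.perf_counter() - t0) * 1000.0)  # ms
    return statistics.mean(samples), statistics.median(samples)

# Usage with a dummy workload standing in for model inference:
mean_ms, median_ms = time_inference(lambda: sum(range(10000)))
print(f"mean {mean_ms:.3f} ms, median {median_ms:.3f} ms")
```

Reporting the median alongside the mean helps when a few runs are slowed by other processes on the machine.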
-
### Description
```shell
Docker: nvcr.io/nvidia/tritonserver:23.04-py3
Gpu: A100
```
How can I stop bi-directional streaming (decoupled mode)?
- I want to stop model inference (the streaming response) when …
-
### System Info
```shell
AWS EC2 instance: trn1.32xlarge
OS: Ubuntu 22.04.4 LTS
Platform:
- Platform: Linux-6.5.0-1023-aws-x86_64-with-glibc2.35
- Python version: 3.10.12
Python packages:
…
-
Thank you for the excellent work.
> Detection models now can be exported to TRT engine with batch size > 1 - **inference code doesn't support it yet**, though now they could be used in Triton Inference Se…
-
Updating my Yomininja results in the program not being able to start. I saw a similar issue, but I can't really read it, so I have no idea whether it's related to that OCR engine; I use Lens.
```
PS C:…
-
(mimctalk) tom@tom-System:~/MimicTalk$ python inference/train_mimictalk_on_a_video.py
cp checkpoints/mimictalk_orig/os_secc2plane_torso/config.yaml checkpoints_mimictalk/GER
/home/tom/miniconda3/env…
-
http://www.nowcode.cn/nav.05.%E4%BA%BA%E5%B7%A5%E6%99%BA%E8%83%BD/12.Triton-Inference.html
-
I have downloaded the Llama 3.2 1B model from Hugging Face with optimum-cli:
optimum-cli export openvino --model meta-llama/Llama-3.2-1B-Instruct llama3.2-1b/1
Below are the downloaded files:
!…
-
Hey, can you kindly tell me whether we can integrate a TRT-LLM-built engine (Whisper, to be precise) into a DeepStream pipeline? As far as I know, we can either use a TRT engine directly (not sure about trt-llm…
-
I want to run h2ogpt with just an inference API, without specifying a base model name.
For example, I have my LLaMA model deployed on an external server that exposes an API for inference, so I want to cons…
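If the external server speaks an OpenAI-style completions API (an assumption about the deployment; the `/v1/completions` path, payload fields, and both function names below are placeholders for illustration, not h2oGPT's actual configuration), the client side of "inference API only" can be as small as a stdlib POST:

```python
import json
import urllib.request

def build_payload(prompt, max_tokens=128):
    # Minimal OpenAI-style completion payload; the field names are
    # assumptions about the external server's API, not h2oGPT internals.
    return {"prompt": prompt, "max_tokens": max_tokens}

def remote_infer(base_url, prompt):
    # POST the payload to a hypothetical /v1/completions endpoint on
    # the external server and return the parsed JSON response.
    req = urllib.request.Request(
        base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a shape like this, no base model name is needed locally; the external server owns the model and the client only knows the endpoint URL.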