-
### System Info
I am working on the benchmarking suite on the vLLM team and am now trying to run TensorRT-LLM for comparison. I am relying on this GitHub repo (https://github.com/neuralmagic/tensorrt-demo)…
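For a quick sanity check against that kind of deployment, a single request can confirm the server responds before any benchmarking. This is only a sketch: the `ensemble` model name, port 8000, and the payload fields follow common Triton + TensorRT-LLM generate-endpoint conventions and may not match what the demo repo actually configures.

```
# Hedged sketch: querying a Triton-served TensorRT-LLM model through the
# HTTP generate extension. Model name, port, and field names are
# assumptions based on typical TRT-LLM backend setups, not the demo repo.
import requests

url = "http://localhost:8000/v2/models/ensemble/generate"
payload = {
    "text_input": "What is the capital of France?",
    "max_tokens": 64,
    "bad_words": "",
    "stop_words": "",
}
resp = requests.post(url, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json().get("text_output"))
```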
-
Is there any client API for LTU-AS (13B)?
I cannot find the 13B checkpoints in the GitHub repo, and the API only supports "7B (Default)", not "13B (Beta)".
-
### System Info
Hi,
I noticed there is no Slack, Discord, or IRC channel for TensorRT. Discussing things in a channel could offload some future tickets, so I created one.
I hope its…
-
### System Info
pandasai==2.2.14
Python 3.10.12
### 🐛 Describe the bug
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AwqConfig
model_id = "hugging-quants/Meta…
-
I'm using the nvcr.io/nvidia/tritonserver:23.10-py3 container for my inference, via the C++ gRPC API. There are several models in the container: a YOLOv8-like architecture in TensorRT plus a few TorchScript model…
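The request shape is easiest to show with Triton's Python gRPC client (the C++ client mirrors it one-to-one). Everything model-specific below (the model name and the `images`/`output0` tensor names) is an assumption for a typical YOLOv8-style export, not taken from the post.

```
# Hedged sketch of a Triton gRPC inference call using the Python client;
# the C++ client follows the same request structure. The model name and
# the "images"/"output0" tensor names are assumptions for a typical
# YOLOv8-style export.
import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")

dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)  # NCHW dummy input
infer_input = grpcclient.InferInput("images", list(dummy.shape), "FP32")
infer_input.set_data_from_numpy(dummy)

result = client.infer(model_name="yolov8", inputs=[infer_input])
print(result.as_numpy("output0").shape)
```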
-
We are using Triton Inference Server for model inference and are currently facing throughput bottlenecks with LLM inference. I saw in a public video that NVIDIA has optimized LLM serving by supporting `In…
-
**ON server:**
```
INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:3013
INFO:w…
```
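Incidentally, the warning in that log is actionable: werkzeug's development server is not meant for deployment. A minimal sketch of swapping in a production WSGI server (waitress here; the app module name is hypothetical):

```
# Hedged sketch: serving the Flask app through waitress instead of the
# werkzeug development server. "myapp" is a hypothetical module name;
# host and port are taken from the log above.
from waitress import serve

from myapp import app  # hypothetical import of the Flask app

serve(app, host="127.0.0.1", port=3013)
```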
-
**Kibana version:** 8.14.0-SNAPSHOT
**Elasticsearch version:** 8.14.0-SNAPSHOT
**Server OS version:** OSX 14.3
**Original install method (e.g. download page, yum, from source, etc.):** sour…
-
### 🐛 Describe the bug
TorchServe version is 0.10.0.
Here is my code:
```
def get_inference_stub(address: str, port: Union[str, int] = 7070):
    channel = grpc.insecure_channel(address + ':' + str(p…
```
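For context, the usual TorchServe gRPC client pattern looks roughly like the sketch below, assuming `inference_pb2` and `inference_pb2_grpc` have been generated from TorchServe's proto files and are importable; the model name and payload are illustrative.

```
# Hedged sketch of a TorchServe gRPC client, assuming inference_pb2 and
# inference_pb2_grpc were generated from TorchServe's proto files.
# The model name "mymodel" and the payload are illustrative.
from typing import Union

import grpc
import inference_pb2
import inference_pb2_grpc

def get_inference_stub(address: str, port: Union[str, int] = 7070):
    channel = grpc.insecure_channel(address + ':' + str(port))
    return inference_pb2_grpc.InferenceAPIsServiceStub(channel)

def infer(stub, model_name: str, data: bytes):
    request = inference_pb2.PredictionsRequest(
        model_name=model_name, input={"data": data}
    )
    return stub.Predictions(request)

stub = get_inference_stub("localhost")
print(infer(stub, "mymodel", b"example payload"))
```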
-
```
Server -> Receiving message of size: 24883378
Server -> 24883378 bytes read
Server -> Message parsed
Server -> Received inference request
Server -> Requesting inference on model: densepose
Server…
```