-
Can I specify a particular version to load or unload when using Triton Inference Server for model management?
I only found the following two APIs:
Load model: v2/repository/models/{model-name}/load
…
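For reference, Triton's model-repository extension lets the load endpoint take an optional JSON body whose `config` parameter is a model-config override; pinning a version via a `version_policy` override is one documented way to load a specific version. Below is a minimal sketch that only builds the request URL and body (the host `localhost:8000` and the model name `my_model` are placeholders; nothing is sent, and this has not been verified against a live server):

```python
import json

def build_load_request(model_name, version):
    """Build the URL and JSON body for Triton's repository load API,
    overriding the model config so that only `version` is served.
    Sketch based on Triton's model-repository protocol extension."""
    url = f"http://localhost:8000/v2/repository/models/{model_name}/load"
    # "config" must be a JSON-encoded model-configuration override.
    config = {"version_policy": {"specific": {"versions": [version]}}}
    body = {"parameters": {"config": json.dumps(config)}}
    return url, body

url, body = build_load_request("my_model", 3)
```

The resulting `body` would then be POSTed to `url`, e.g. with `requests.post(url, json=body)`.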
-
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 111
CUDA SETUP: Loading binary /usr/local/lib/python3.8/dist-packages/bitsandbytes/libbitsandbytes_c…
-
### The bug
Hello, I get this error when copying the image after using the next and previous buttons.
https://github.com/user-attachments/assets/bc878a6f-94d8-47d4-bc92-b77aafa59dcc
### The OS …
-
## Bug Report
Does TensorFlow Serving support XLA-compiled SavedModels, or am I doing something wrong?
### System information
- **OS Platform and Distribution (e.g., Linux Ubuntu 16.04)**: [D…
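For context, a minimal sketch of how such a SavedModel might be produced (assuming TF 2.x, where `jit_compile=True` on `tf.function` requests XLA compilation; the module, values, and export path are placeholders, not taken from the report):

```python
import tensorflow as tf

class AddOne(tf.Module):
    # jit_compile=True marks this function for XLA compilation.
    @tf.function(jit_compile=True,
                 input_signature=[tf.TensorSpec([None], tf.float32)])
    def serve(self, x):
        return x + 1.0

module = AddOne()
# Export with an explicit serving signature, as TF Serving expects.
tf.saved_model.save(module, "/tmp/add_one_xla",
                    signatures={"serving_default": module.serve})
```

Whether the server then runs the XLA-compiled version is exactly the question posed above.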
-
Hi experts,
I'm running a 1.3B model on Windows with a 16GB V100 and the environment below, but hit an issue for which I couldn't find any clue. Could you please help check it?
TensorRT-LLM version: tag v0.10.0…
-
I use the Docker image chat-ui-db as the frontend, text-generation-inference as the inference backend, and meta-llama/Llama-2-70b-chat-hf as the model.
In the model field of the .env.local file, I hav…
-
There is a BERT-based model used at Ant Group for geographic-similarity-comparison inference.
https://modelscope.cn/models/damo/mgeo_geographic_entity_alignment_chinese_base/summary
https://modelscope.cn/m…
-
## Description
There exists a strange combination of factors under which inference can cause the reasoner to behave as if it ignores a `not` subquery, leading to incorrect results.
I have a query that c…
-
**Description**
I have a 5-step ensemble pipeline for Triton:
* 3 steps are TorchScript artifacts
* 2 steps are TensorRT-compiled models
In the pbtxt files I have:
```
instance_group [{ kind: KIN…
-
If I test it with a concurrency of 2, it runs into an error. The error detail is:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 1473,…