-
Hi there! I'm serving TensorRT-LLM models from Python and I'm wondering what the recommended approach is for serving multiple models at once. I've tried / considered:
- `GenerationS…
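One workable pattern, sketched below in plain Python, is to load each engine once and dispatch requests by model name through a small registry. Note this is a generic sketch: `register`/`generate` and the callable engines are illustrative stand-ins for however you actually load TensorRT-LLM engines (e.g. one session per engine), not a TensorRT-LLM API.

```python
from typing import Callable, Dict


class ModelRegistry:
    """Holds one loaded engine per model name and dispatches generate() calls."""

    def __init__(self) -> None:
        self._engines: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, engine: Callable[[str], str]) -> None:
        # In practice `engine` would be a loaded TensorRT-LLM session;
        # any callable taking a prompt works for this sketch.
        self._engines[name] = engine

    def generate(self, name: str, prompt: str) -> str:
        if name not in self._engines:
            raise KeyError(f"unknown model: {name}")
        return self._engines[name](prompt)


# Usage with dummy engines standing in for real loaded models:
registry = ModelRegistry()
registry.register("llama", lambda p: f"[llama] {p}")
registry.register("mistral", lambda p: f"[mistral] {p}")
print(registry.generate("llama", "hello"))  # [llama] hello
```

The point of the registry is that engines are loaded once at startup and requests only pay a dictionary lookup, which keeps per-request latency independent of how many models are resident.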
-
### System Info
"@huggingface/transformers": "^3.0.0-alpha.5"
### Environment/Platform
- [X] Website/web-app
- [X] Browser extension
- [ ] Server-side (e.g., Node.js, Deno, Bun)
- [ ] Des…
-
Issue: running via webui.py, with the "pretrained voice" inference mode selected, clicking "generate audio" raises an error; the server shows: RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'. The full error output is below:
2024-09-27 16:32:01,942 INFO get sft inference request
tn I am the all-new generative speech large model from the Tongyi Lab speech team, …
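For context, this RuntimeError is PyTorch reporting that a half-precision (fp16) kernel was invoked on CPU, where fp16 matmul ops are unavailable on many builds. A common workaround, sketched below under the assumption that the model was loaded in half precision, is to cast the module and its inputs back to float32 when running on CPU:

```python
import torch

# A half-precision linear layer on CPU: on PyTorch builds without fp16
# CPU kernels, calling it raises
#   RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
layer = torch.nn.Linear(4, 4).half()
x = torch.randn(1, 4, dtype=torch.float16)
try:
    layer(x)
except RuntimeError as e:
    print(e)  # the error reported above, on builds lacking fp16 addmm

# Workaround: cast module and inputs to float32 for CPU inference.
layer_fp32 = layer.float()
y = layer_fp32(x.float())
print(y.dtype)  # torch.float32
```

Casting the whole model once at load time (rather than per call) avoids repeated conversion cost; on GPU the fp16 path can stay as-is.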
-
Traceback (most recent call last):
  File "/home/hirpa/fzq/2024/code/functionary/server_vllm.py", line 39, in <module>
    from functionary.vllm_inference import process_chat_completion
  File "/home/hirpa…
-
Dear developers,
I decided to clone the code/demo but apparently the data files are not present on the LFS servers. I get the following error during clone:
```
Downloading Inference/db/imdb_raw…
-
### Motivation
I found that the input token logprob is supported by Offline Inference Pipeline, as mentioned in [doc](https://lmdeploy.readthedocs.io/en/latest/inference/vl_pipeline.html#calculate-lo…
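Independent of lmdeploy's API, the quantity in question is just the log-softmax of each position's logits, gathered at the next input token. A minimal numpy sketch of that computation (random logits standing in for real model output):

```python
import numpy as np


def input_token_logprobs(logits: np.ndarray, input_ids: np.ndarray) -> np.ndarray:
    """Log-probability each position assigns to the *next* input token.

    logits:    (seq_len, vocab) raw model outputs
    input_ids: (seq_len,) token ids of the prompt
    """
    # Numerically stable log-softmax over the vocab axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Position t predicts token t+1, so align log_probs[:-1] with ids[1:].
    return log_probs[np.arange(len(input_ids) - 1), input_ids[1:]]


rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 10))  # pretend vocab of 10, prompt of 5 tokens
ids = np.array([1, 4, 2, 9, 3])
lp = input_token_logprobs(logits, ids)
print(lp.shape)  # (4,)
```

The off-by-one alignment (logits at position t score token t+1) is the detail most implementations get wrong; the first prompt token has no logprob because nothing predicts it.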
-
### The bug
I was looking at my logs because of an issue I am having with facial recognition. These errors are unrelated, as they happened during the night, but I wanted to draw some attention to …
-
### Is your enhancement related to a problem? Please describe
It seems openweb-ui can be integrated through a container; it would be good to prototype this.
### Describe the solution you'd like
Replace current…
-
**Describe the bug**
After updating to 0.3.21, I'm getting:
2024-07-27 13:34:07,646 - MemGPT.memgpt.server.server - DEBUG - Starting agent step
/MemGPT/memgpt/data_types.py:92: UserWarning: Failed to…
-
Triton inference server:r24.07 and model_analyzer:1.42.0
config.pbtxt
```
backend: "python"
max_batch_size: 32
input [
{
name: "IN0"
data_type: TYPE_STRING
dims: [ 16 ]
}
]…