-
I use Ollama as my inference server for local LLMs. Ollama is supported by many LLM frameworks, but not Guidance.
I would love to see a direct integration with Ollama via the `models` package.
I'm awa…
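In the meantime, one hedged workaround is to point an OpenAI-compatible client at Ollama's built-in OpenAI-compatible endpoint. This is only a sketch, not a Guidance integration: the model name (`llama3`) and the local URL are assumptions based on a default Ollama install, and it bypasses the `models` package entirely.

```python
# Hedged sketch: talk to a local Ollama server through its OpenAI-compatible API.
# Assumes Ollama is running on the default port and that a model such as
# "llama3" has already been pulled (e.g. `ollama pull llama3`).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # any non-empty string; Ollama ignores it
)

response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```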
-
**Is your feature request related to a problem? Please describe.**
Segmentation viewing and export to PACS/VNAs outside of the integrated OHIF viewer and/or Slicer. In order to integrate inference pr…
-
Hi,
I am very interested in the distributed inference support in Colossal AI. Since we have pre-trained NLP models from PyTorch or JAX, I wonder whether it is possible, or what would need to be done, to use EnergonAI for infere…
-
### Your current environment
```text
The output of `python collect_env.py`
```
### How would you like to use vllm
I want to run inference with [ColPali](https://huggingface.co/vidore/colpali). I …
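For background, the usual vLLM offline-inference pattern is sketched below. It deliberately uses a small placeholder model (`facebook/opt-125m`) rather than ColPali, since whether ColPali, a late-interaction retrieval model, can be served this way depends on the vLLM version and its multimodal support.

```python
# Hedged sketch of vLLM offline inference with a small placeholder model;
# swap in the target model only if the installed vLLM release supports it.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.0, max_tokens=64)

outputs = llm.generate(["Describe the document in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```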
-
### Prerequisite
- [X] I have searched [Issues](https://github.com/open-mmlab/mmdetection3d/issues) and [Discussions](https://github.com/open-mmlab/mmdetection3d/discussions) but cannot get the expec…
-
Hi,
I am trying to reproduce step 2 of the semantic search through Wikipedia demo on my local machine with an RTX 3090, and while importing data with the `nohup python3 -u import.py &` command I got the follow…
-
Do you support the ExLlamaV2 backend for inference, which supports EXL quants?
The current alternative is vLLM, but that doesn't support EXL quants. Also, after running a perplexity test, EXL is the b…
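For reference, the standalone way to run an EXL2 quant today is through the `exllamav2` library itself. The sketch below follows that project's basic-generator examples; the model path and sampler settings are assumptions, and exact class names may differ between releases.

```python
# Hedged sketch: load and run an EXL2-quantized model with the exllamav2 library.
# Follows the project's basic generator examples; names may vary across releases.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/Mistral-7B-exl2-4.0bpw"  # assumed local EXL2 quant
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7

print(generator.generate_simple("Hello, my name is", settings, num_tokens=64))
```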
-
### Your current environment
```text
Collecting environment information...
/opt/conda/lib/python3.10/site-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprec…
-
**Description**
I am trying to deploy Mistral-7B with Triton + TensorRT-LLM and am running into this issue.
**Triton Information**
Are you using the Triton container or did you build it yourself?
nvcr.i…
-
The final log output shows:
qanything-container-local | Triton服务正在启动,可能需要一段时间...你有时间去冲杯咖啡 :)
qanything-container-local | The triton service is starting up, it can be long... you have time to make a coffee :)
qanyth…