-
### Description
Improve validation and exception handling within the inference API.
Here are a few areas to get started:
- When a text embedding service is created, during the creation process w…
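For context, creating a text embedding endpoint through the inference API looks roughly like the sketch below, using Python's `requests` against a local cluster; the endpoint id, service, and service settings are illustrative assumptions, not part of the original report.
```
# Sketch: create a text embedding endpoint via the _inference API.
# Endpoint id, service, and settings below are illustrative assumptions.
import requests

resp = requests.put(
    "http://localhost:9200/_inference/text_embedding/my-embedding-endpoint",
    json={
        "service": "openai",
        "service_settings": {
            "api_key": "<api-key>",
            "model_id": "text-embedding-3-small",
        },
    },
)
print(resp.status_code, resp.json())
```
Validation during this creation step is where the report suggests starting.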
-
Hello everyone, I'm trying to use torchrun for dual-GPU parallel inference, and I have two questions. First, I found that torchrun is mainly used for model training, so can it be…
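torchrun is really just a multi-process launcher, so it can drive inference as well as training. Below is a minimal data-parallel sketch; the `Linear` model and the input batch are stand-ins for a real model and real requests.
```
# Minimal data-parallel inference sketch; the model and inputs are stand-ins.
# Launch with: torchrun --nproc_per_node=2 infer.py
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")  # torchrun sets the rendezvous env vars
    rank = dist.get_rank()
    world_size = dist.get_world_size()
    local_rank = int(os.environ["LOCAL_RANK"])  # one GPU per process
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(16, 4).cuda(local_rank)  # stand-in for a real model
    model.eval()

    inputs = torch.randn(8, 16)                        # pretend batch of requests
    shard = inputs[rank::world_size].cuda(local_rank)  # each rank takes a slice

    with torch.no_grad():
        out = model(shard)
    print(f"rank {rank}: produced {out.shape[0]} outputs")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```
Each process pins one GPU and handles its own slice of the inputs, so two cards serve requests in parallel without any training-specific machinery.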
-
### 🚀 The feature, motivation and pitch
I keep trying to learn and study the llama-stack API and its examples, but they are complex. There’s no documentation, or I’m unable to access it.
I wa…
-
**Why is it that when using a quantized model for inference, the TTFT optimization is not obvious, but the overall inference efficiency is improved a lot? At the same time, the inference efficiency…
-
The stack tool cannot support large models with a .pth extension downloaded from Meta; it throws an error at runtime. Does it have to use models downloaded from Hugging Face? Is this setup unreaso…
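If only Meta-format .pth checkpoints are available, one common workaround is converting them to the Hugging Face layout with the conversion script that ships with transformers; the paths below are placeholders and the exact flags vary by transformers version and model size.
```
# Sketch: convert Meta-format .pth checkpoints to the Hugging Face layout.
# Paths are placeholders; flags vary across transformers versions.
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/meta/llama --model_size 7B --output_dir /path/to/hf-model
```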
-
I used the official Docker image and downloaded the weight file from Meta. The md5sum check proved that the file was fine, but it still failed to run, which left me confused. I confirm that CUDA can be …
-
I started the webui with the command from the official documentation:
`python3 webui.py --port 50000 --model_dir pretrained_models/CosyVoice-300M`
After launching, the interface opens normally, as shown below.
Above is my configuration. After clicking Run, an error is reported; the terminal output is as follows:
```
2024-10-28 11:15:36,546 INFO get zero_shot inferenc…
-
I downloaded the 1B model from Hugging Face and encountered an error while running it. The following is the configuration process, and I am puzzled as to why I need to bind it to the address [::ffff:0.0…
-
### Description
The inference API supports text embedding and rerank task types. If an inference endpoint is created for text embedding, and a request is made to perform inference and the request co…
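A sketch of the task-type mismatch this report describes: create a text embedding endpoint, then send a rerank-style request (note the `query` field) to it. The endpoint id and service settings are illustrative, and Python's `requests` stands in for whatever client is actually used.
```
# Sketch: exercise the task-type mismatch path against a local cluster.
# Endpoint id and service settings are illustrative assumptions.
import requests

ES = "http://localhost:9200"

requests.put(
    f"{ES}/_inference/text_embedding/my-embedding-endpoint",
    json={
        "service": "openai",
        "service_settings": {"api_key": "<api-key>", "model_id": "text-embedding-3-small"},
    },
)

# A rerank request aimed at the embedding endpoint:
r = requests.post(
    f"{ES}/_inference/rerank/my-embedding-endpoint",
    json={"query": "what is the capital of France?", "input": ["Paris", "Berlin"]},
)
print(r.status_code, r.json())
```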
-
### Describe the issue
According to [Local-LLMs](https://microsoft.github.io/autogen/blog/2023/07/14/Local-LLMs/), AutoGen can support multiple local LLMs.
My command for FastChat:
First,…
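For reference, the pattern from that blog post is to serve the model through FastChat's OpenAI-compatible server and point AutoGen at it. A minimal sketch follows; the host, port, and model name are assumptions, and older AutoGen releases use `api_base` instead of `base_url`.
```
# Sketch: point AutoGen at a local FastChat OpenAI-compatible endpoint.
# Host, port, and model name below are assumptions for illustration.
import autogen

config_list = [
    {
        "model": "chatglm2-6b",                  # whatever model FastChat is serving
        "base_url": "http://localhost:8000/v1",  # "api_base" in older AutoGen releases
        "api_key": "NULL",                       # placeholder; the local server ignores it
    }
]

assistant = autogen.AssistantAgent("assistant", llm_config={"config_list": config_list})
user = autogen.UserProxyAgent("user", human_input_mode="NEVER", code_execution_config=False)
user.initiate_chat(assistant, message="Say hello.")
```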