-
### 🚀 The feature
Right now, mainly proprietary LLMs are supported. It would be great to also support DIY/OSS LLMs - for instance, models hosted in [Databricks Model Serving](https://docs.databricks.com/en/mach…
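As a rough illustration of the kind of endpoint this refers to, here is a minimal sketch of querying an OSS LLM hosted in Databricks Model Serving over its REST invocations API; the workspace URL, endpoint name, token variable, and payload shape are placeholders and depend on how the endpoint was created.
```python
# Hedged sketch (not MLflow integration code): query a hypothetical OSS LLM
# endpoint hosted in Databricks Model Serving via its invocations REST API.
import os
import requests

WORKSPACE = "https://my-workspace.cloud.databricks.com"  # placeholder workspace URL
ENDPOINT = "my-oss-llm"                                   # placeholder endpoint name
url = f"{WORKSPACE}/serving-endpoints/{ENDPOINT}/invocations"

resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
    # The expected payload depends on the served model; a chat-style body is assumed here.
    json={"messages": [{"role": "user", "content": "Hello!"}], "max_tokens": 64},
)
resp.raise_for_status()
print(resp.json())
```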
-
### Issues Policy acknowledgement
- [X] I have read and agree to submit bug reports in accordance with the [issues policy](https://www.github.com/mlflow/mlflow/blob/master/ISSUE_POLICY.md)
### W…
-
Running `cmake` is not successful.
```
❯ cmake --version
cmake version 3.21.0
CMake suite maintained and supported by Kitware (kitware.com/cmake).
```
```
mkdir build
cd build
cmake -DCMAKE_INSTA…
```
-
Hello, I'm using 24.03-trtllm-python-py3 with an image size of 8.38 GB, which is not small but OK.
I'm going to migrate to newer versions like 24.04 or 24.05, but the image size drastically increased to 18.46 …
-
Greetings, @cipher982!
I've seen the benchmark application https://www.llm-benchmarks.com/local and it looks great! I'm currently working on a competitive analysis of these 4 backends: Transformers…
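For context on how such a comparison might be instrumented, here is a minimal, hedged sketch of a tokens-per-second probe for the Hugging Face Transformers backend; the model name, prompt, and generation length are placeholders, and the other backends would need their own equivalents.
```python
# Hedged sketch of a per-backend latency probe using Hugging Face Transformers.
import time
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain speculative decoding in one paragraph."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

start = time.perf_counter()
output = model.generate(**inputs, max_new_tokens=128)
elapsed = time.perf_counter() - start

new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens} tokens in {elapsed:.2f}s -> {new_tokens / elapsed:.1f} tok/s")
```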
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
Hi FlexFlow team,
I used the methods mentioned in #1099 to test the latency (GPU: RTX 4090), but I get a confusing result:
1) LLaMA-7B + 1 SSM (llama-160M), latency: 25.1 s
2) LLaMA-7B (without SSMs), la…
-
### Your current environment
Using the latest available Docker image: vllm/vllm-openai:v0.5.0.post1
### 🐛 Describe the bug
I am getting "Internal Server Error" as the response when calling the /v1/embedd…
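A minimal reproduction sketch, assuming the server exposes the OpenAI-compatible embeddings route and is serving an embedding-capable model; the base URL, API key, and model name below are placeholders.
```python
# Hedged reproduction sketch against a local vLLM OpenAI-compatible server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder URL/key
resp = client.embeddings.create(
    model="intfloat/e5-mistral-7b-instruct",  # placeholder embedding model
    input="hello world",
)
print(len(resp.data[0].embedding))
```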
-
- [ ] [LoRA Land: Fine-Tuned Open-Source LLMs that Outperform GPT-4 - Predibase](https://predibase.com/blog/lora-land-fine-tuned-open-source-llms-that-outperform-gpt-4)
# LoRA Land: Fine…
-
Hello. Thank you for providing vLLM as a great open-source tool for inference and model serving! I was able to build vLLM on a cluster I maintain, but it only appears to work on a single MI210 GPU. Can so…
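For reference, multi-GPU inference in vLLM is normally driven through tensor parallelism; below is a minimal sketch using the offline LLM API, assuming two visible GPUs. The model name and parallel degree are placeholders, and this does not address any ROCm-specific build issue.
```python
# Minimal sketch of tensor-parallel inference with vLLM's offline LLM API,
# assuming two GPUs are visible; model and parallel degree are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-hf", tensor_parallel_size=2)
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```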