-
### Motivation
Recently, Tsinghua University published a survey on LLM inference acceleration that compares TensorRT-LLM and LMDeploy under AWQ. According to the results, **LMDeploy has a higher speed-up…
-
Hi,
Could you please provide a guide on integrating the DeepSpeed approach for multi-GPU Intel Flex 140 to run model inference behind a FastAPI and uvicorn setup?
model id: 'meta-llama/Llama-2-7…
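For context, this is roughly the shape of the setup I have in mind: a minimal sketch, assuming DeepSpeed's `init_inference` with tensor parallelism across the two tiles of the Flex 140 and a FastAPI app served by uvicorn. The full model id, the Intel XPU specifics, and the launch command are assumptions on my side, not a verified recipe:

```python
# Sketch only: FastAPI + uvicorn front-end over a DeepSpeed-initialised
# Hugging Face model. Intel Flex 140 / XPU specifics are assumptions here.
import torch
import deepspeed
import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-2-7b-hf"  # assumed full model id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16)

# Shard the model across 2 devices (the Flex 140 exposes two GPU tiles).
engine = deepspeed.init_inference(
    model,
    tensor_parallel={"tp_size": 2},
    dtype=torch.float16,
    replace_with_kernel_inject=False,  # kernel injection may not support XPU
)

app = FastAPI()

class Prompt(BaseModel):
    text: str
    max_new_tokens: int = 128

@app.post("/generate")
def generate(req: Prompt):
    inputs = tokenizer(req.text, return_tensors="pt").to(engine.module.device)
    with torch.no_grad():
        out = engine.module.generate(
            inputs.input_ids, max_new_tokens=req.max_new_tokens
        )
    return {"completion": tokenizer.decode(out[0], skip_special_tokens=True)}

if __name__ == "__main__":
    # Presumably launched via the multi-process DeepSpeed launcher, e.g.
    # `deepspeed --num_gpus 2 server.py`; in that case only rank 0 should
    # bind the HTTP port.
    uvicorn.run(app, host="0.0.0.0", port=8000)
```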
-
Hi there,
I am wondering what hardware Ray uses for serving in this llmperf leaderboard. Is it CPU or GPU? If it is GPU, which model?
Thanks,
Fizzbb
-
### 🚀 The feature, motivation and pitch
This library, https://github.com/mit-han-lab/qserve, introduces a number of innovations. Most important is the W4A8KV4 quantization, which the paper (htt…
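For context, W4A8KV4 means 4-bit weights, 8-bit activations, and a 4-bit KV cache. Below is a toy sketch of the weight/activation part using plain symmetric uniform quantization; it is only meant to illustrate the bit-widths involved, not QServe's actual kernels, grouping, or memory layout:

```python
# Toy illustration of W4A8-style quantization (4-bit weights, 8-bit activations).
# NOT QServe's implementation; just shows what the bit-widths mean numerically.
import torch

def quantize_symmetric(x: torch.Tensor, bits: int, dim=None):
    """Symmetric uniform quantization to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1                               # 7 for 4-bit, 127 for 8-bit
    if dim is None:
        scale = x.abs().max() / qmax                          # per-tensor scale
    else:
        scale = x.abs().amax(dim=dim, keepdim=True) / qmax    # per-channel scale
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q, scale

# 4-bit weights (per output channel), 8-bit activations (per tensor)
w = torch.randn(4096, 4096)
a = torch.randn(16, 4096)
w_q, w_scale = quantize_symmetric(w, bits=4, dim=1)
a_q, a_scale = quantize_symmetric(a, bits=8)

# Integer matmul followed by rescaling approximates the fp16/fp32 matmul
y = (a_q @ w_q.t()) * a_scale * w_scale.t()
print((a @ w.t() - y).abs().mean())  # mean quantization error
```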
-
### System Info
x86_64
Ubuntu 20.04
A100x8
TRT-LLM version v0.9.0
### Who can help?
_No response_
### Information
- [X] The official example scripts
- [ ] My own modified scripts
…
-
Hey,
Currently, Ollama saves models in a local cache. To maintain different versions of LLMs or finetuned ones, and also for extensive monitoring, it would be a good idea to provide integration with M…
-
Hi, I am trying to run vLLM serving for the neural-chat model using https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/vLLM-Serving. However, I am facing this issue:
![image](htt…
-
/kind feature
**Describe the solution you'd like**
Please add [https://github.com/xorbitsai/inference](https://github.com/xorbitsai/inference) as a KServe Hugging Face LLM serving runtime.
Xor…