-
Greetings, @cipher982!
I've seen the benchmark application https://www.llm-benchmarks.com/local and it looks great! I'm currently working on a competitive analysis of these 4 backends: Transformers…
-
In line with the main philosophy of the Symbiont app, we want to use products that are open source and provide the option for self-hosting for maximum privacy and control.
-
When I use the multimodal example, I download the original model liuhaotian/llava-v1.5-7b, but this error occurs:
llama = from_hugging_face(
File "/usr/local/lib/python3.10/dist-packages/tensor…
-
ValueError: LoRA rank 64 is greater than max_lora_rank 16.
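This error is raised by vLLM's LoRA configuration check: an adapter's rank may not exceed the engine's configured `max_lora_rank` (default 16). A minimal sketch of that validation follows; `validate_lora_rank` is a hypothetical helper for illustration, not vLLM's actual code.

```python
# Sketch of the rank check behind the ValueError above.
# `max_lora_rank` mirrors vLLM's engine argument of the same name;
# this helper is illustrative only.
def validate_lora_rank(lora_rank: int, max_lora_rank: int = 16) -> None:
    if lora_rank > max_lora_rank:
        raise ValueError(
            f"LoRA rank {lora_rank} is greater than "
            f"max_lora_rank {max_lora_rank}."
        )

validate_lora_rank(16)  # a rank-16 adapter fits the default cap

try:
    validate_lora_rank(64)  # a rank-64 adapter does not
except ValueError as err:
    print(err)
```

The usual fix is to raise the cap when starting the engine (vLLM exposes a `--max-lora-rank` option) so it covers the largest adapter you intend to load.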
-
Is there any performance comparison data between ScaleLLM and vLLM?
-
Concise Description:
I'd like to use JAX for distributed training of LLMs. Moreover, the new Keras release supports JAX as a backend in addition to TF.
Describe the solution you'd like
I'd …
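For context on the Keras point above: Keras 3 selects its backend from the `KERAS_BACKEND` environment variable, read once at import time. A minimal sketch (the variable must be set before `keras` is first imported):

```python
import os

# Keras 3 reads KERAS_BACKEND when the package is imported, so this
# must run before the first `import keras`.
os.environ["KERAS_BACKEND"] = "jax"

# Importing keras at this point would run on JAX, e.g.:
#   import keras
#   keras.backend.backend()  # reports the active backend name
print(os.environ["KERAS_BACKEND"])
```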
-
**Description**
I ran a benchmark of Meta-Llama-3-8B-Instruct on 8×RTX 4090,
![image](https://github.com/triton-inference-server/server/assets/68674291/1a0fd341-8d8f-4893-973c-ed1ed3b74aca)
when r…
-
## Ask your question here:
Hello! I am working on an integration between Kserve/Knative with vLLM for deploying LLMs. vLLM is a production inference server for LLMs, and I have instrumented…
-
Objective: TriagerX is a novel AI-enabled software analytics tool that we developed via the IBM CAS project (with Dr. Uddin). TriagerX aims to assign an issue to components/teams and developers and to…
-
### Community Note
* Please vote on this issue by adding a 👍 [reaction](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to the original issue to help the…