-
### System Info
Using Windows 11
### Information
- [X] The official example scripts
- [X] My own modified scripts
### 🐛 Describe the bug
When running:
```
docker run -it -p 5000:5000 -v C:/Us…
-
## Bug Description
I'm trying to serve a Torch-TensorRT optimized model with the NVIDIA Triton Inference Server, following the provided tutorial:
https://pytorch.org/TensorRT/tutorials/serving_torch_tensorrt_with_t…
-
In `reverie/backend_server/persona/prompt_template/run_gpt_prompt.py`, multiple requests to OpenAI are made with a hardcoded model `gpt-35-turbo-0125`, which is currently not a valid/supported model o…
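One way to avoid this class of breakage is to stop hardcoding the model name entirely. The sketch below is a hypothetical helper (not part of the repository) that reads the model from an environment variable, with a fallback default that you should replace with whatever model your OpenAI account actually supports:

```python
import os

def get_model_name(default: str = "gpt-3.5-turbo-0125") -> str:
    """Return the OpenAI model to use for prompt requests.

    The OPENAI_MODEL environment variable, if set, takes precedence;
    the default value here is only an assumption and should be swapped
    for a model your account has access to.
    """
    return os.environ.get("OPENAI_MODEL", default)
```

Each call site in `run_gpt_prompt.py` would then pass `get_model_name()` instead of the literal string, so a deprecated or invalid model can be fixed without editing code.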
-
Hello, I hope you are doing well.
I intend to run resnet50 in the server scenario (datacenter) using the script in the docs:
```
cm run script --tags=run-mlperf,inference,_r4.1-dev \
--mode…
-
Envoy supports sending the full request body to the external authorization server via the `with_request_body` filter configuration. Do you think that it is possible to expose such a feature on the Securit…
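For reference, this is roughly what the underlying Envoy configuration looks like. The field names follow Envoy's `ext_authz` HTTP filter API; the byte limit is an illustrative value, not a recommendation:

```yaml
# Sketch of Envoy's ext_authz filter with request-body buffering enabled.
http_filters:
- name: envoy.filters.http.ext_authz
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
    with_request_body:
      max_request_bytes: 8192      # buffer at most this many body bytes
      allow_partial_message: true  # still call the authz server if the body is larger
```

Exposing this would presumably mean surfacing `max_request_bytes` and `allow_partial_message` as options on the corresponding policy resource.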
-
Here is a trace from my Intel Arc A770 via Docker:
```
$ ollama run deepseek-coder-v2
>>> write fizzbuzz
"""""""""""""""""""""""""""""""
```
And here is a trace from Arch Linux running on …
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch…
-
### What is the issue?
There are no issues with any model that fits on a single 3090, but it seems to run out of memory when trying to distribute a model across to the second 3090.
```
INFO [wmain] starting c++ runner | ti…
-
### What's the use case?
I have a job with a dynamic graph, using `DynamicOut`.
The ops are configured with Pydantic configs, allowing us to parametrize the ops in the Launchpad.
In each Op, we…
-
### Self Checks
- [X] This is only for bug report, if you would like to ask a question, please head to [Discussions](https://github.com/langgenius/dify/discussions/categories/general).
- [X] I have s…