-
Hello. Thanks for providing vLLM, a great open-source tool for inference and model serving! I was able to build vLLM on a cluster I maintain, but it only appears to work on a single MI210 GPU. Can so…
-
I'm trying to install it into NVIDIA's PyTorch container and getting this error while running.
The same issue occurs when installing it on a Lambda GPU Cloud H100 instance (all defaults).
```
root@0971a018b7ec…
-
### Your current environment
```
Collecting environment information...
PyTorch version: 2.2.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: …
-
### Your current environment
Docker latest 0.5.4
```
docker pull vllm/vllm-openai:latest
docker run -d --restart=always \
--runtime=nvidia \
--gpus '"device=0"' \
--shm-size=10.…
-
### Your current environment
I used version 0.4.3, installed via pip, with CUDA version 12.0 on an A100 GPU.
RuntimeError: t == DeviceType::CUDA INTERNAL ASSERT FAILED
### 🐛 Describe the bug
```
INFO 06-02 03…
-
```
D:\MiniCPM\venv\Lib\site-packages\torch\_tensor.py:962: UserWarning: The operator 'aten::pow.Scalar_out' is not currently supported on the ocl backend. Please open an issue at for requesting supp…
-
Following the Slack conversation about device Perf CI responsibility: to better distribute CI monitoring among the various model owners, the CI has to be split into multiple jobs.
Ini…
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
I found that Google's latest open-source LLM, Gemma, has two versions of its model structure:
1. https://github.com/google/gemma_pytorch/blob/main/gemma/model_xla.py
2. https://github.com/google/gemma_…
-
Hi,
Great work on onnxscript! I was wondering whether collective operations / MPI primitives like `reduce_scatter` and `all_gather` will be added somewhere down the road? If not, I'd be very curi…
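For readers unfamiliar with these primitives, here is a minimal pure-Python sketch of what `all_gather` and `reduce_scatter` compute. This is only an illustration of the collectives' semantics (simulated over plain lists, one entry per "rank") — it is not the onnxscript or MPI API, and the function names here are just for demonstration:

```python
# Semantics of two MPI-style collectives, simulated without any
# communication: each outer list element stands for one rank's data.

def all_gather(per_rank_chunks):
    # Every rank ends up with the concatenation of all ranks' chunks.
    gathered = [x for chunk in per_rank_chunks for x in chunk]
    return [list(gathered) for _ in per_rank_chunks]

def reduce_scatter(per_rank_vectors):
    # Element-wise sum across ranks, then each rank keeps only its
    # own equal-sized shard of the reduced result.
    n = len(per_rank_vectors)
    summed = [sum(vals) for vals in zip(*per_rank_vectors)]
    shard = len(summed) // n
    return [summed[r * shard:(r + 1) * shard] for r in range(n)]

# Two "ranks":
print(all_gather([[1, 2], [3, 4]]))                      # each rank: [1, 2, 3, 4]
print(reduce_scatter([[1, 2, 3, 4], [10, 20, 30, 40]]))  # [[11, 22], [33, 44]]
```

In real frameworks (e.g. `torch.distributed`) these run across processes or devices; the point of the sketch is only the input/output contract a future onnxscript op would need to express.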