-
### System Info
```shell
accelerate 1.1.1
neuronx-cc 2.14.227.0+2d4f85be
neuronx-distributed 0.8.0
neuronx-distributed-training 1.0.0
optimum …
-
### Your current environment
vllm 0.5.2
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to b…
-
### Prerequisites
- [X] I have read the [ServerlessLLM documentation](https://serverlessllm.github.io/).
- [X] I have searched the [Issue Tracker](https://github.com/ServerlessLLM/ServerlessLLM/issue…
-
### UPDATE(11/23/2024)
Currently, @james-p-xu is removing rope, @yizhang2077 is removing distributed, @HandH1998 is removing weight loader. Optimistically, we can remove these dependencies by the…
-
My code
from vllm import LLM, SamplingParams
from chatharuhi import ChatHaruhi (merely importing ChatHaruhi here raises "Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'sp…
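The error above is PyTorch's standard complaint when a forked child process tries to initialize CUDA a second time. A minimal sketch of the usual workaround, switching the multiprocessing start method from the default `fork` to `spawn` (the `double` worker and its payload are illustrative, not taken from the original report):

```python
import multiprocessing as mp

def double(x):
    # In the real report this worker would touch CUDA; a spawned worker
    # starts a fresh interpreter, so CUDA can be initialized safely in it.
    return x * 2

if __name__ == "__main__":
    # 'spawn' instead of the default 'fork' avoids re-initializing CUDA
    # in a forked child process.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        results = pool.map(double, [1, 2, 3])
    print(results)
```

Calling `multiprocessing.set_start_method("spawn")` once at program start achieves the same thing globally; using a context as above keeps the choice local to this pool.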
-
### Your current environment
Running via Docker
```text
docker run --runtime nvidia --gpus \"device=${CUDA_VISIBLE_DEVICES}\" --shm-size 8g -v $volume:/root/.cache/huggingface …
-
### Your current environment
```text
The environment is the latest vllm-0.5.4 docker environment, and the command to run is: python3 api_server.py --port 10195 --model /data/models/Mistral-Large-Ins…
-
```
2024-11-09 21:39:44.994636: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already b…
-
### Your current environment
AMD Radeon + Kubernetes
### Model Input Dumps
`vllm serve mistralai/Mistral-7B-Instruct-v0.3 --trust-remote-code --enable-chunked-prefill --max_num_batch…
-
### The rule detects the following modeling patterns
* Detect when the number of incoming flows of a parallel gateway does not match the number of outgoing flows of the closest parallel gateway…
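The first pattern above can be sketched as a flow-count comparison over a toy gateway model (the `ParallelGateway` dataclass and `join_mismatch` helper are hypothetical names for illustration, not part of the actual rule implementation):

```python
from dataclasses import dataclass

@dataclass
class ParallelGateway:
    # Counts of sequence flows entering and leaving the gateway.
    incoming: int
    outgoing: int

def join_mismatch(split: ParallelGateway, join: ParallelGateway) -> bool:
    # Flag the pattern: the joining gateway's incoming-flow count does not
    # match the outgoing-flow count of the closest splitting gateway.
    return split.outgoing != join.incoming

# A split with 3 outgoing branches, joined by a gateway with only 2 incoming flows:
print(join_mismatch(ParallelGateway(1, 3), ParallelGateway(2, 1)))  # True
```

A real implementation would first pair each join with its closest matching split by walking the process graph; the check itself reduces to this comparison.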