-
# Overview
Does DeepSpeed leverage 3D parallelism (i.e., data parallelism + pipeline parallelism + tensor parallelism) for fine-tuning Hugging Face models (e.g., GPT-J, LLaMA)?
May I ask anybody k…
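For reference, a minimal sketch of the pipeline side of DeepSpeed, which composes with ZeRO-style data parallelism across pipeline replicas; `get_layers()` and `ds_config.json` are hypothetical placeholders, and tensor parallelism for HF models typically comes via Megatron-DeepSpeed rather than this API:

```python
# Sketch only, not a full fine-tuning script: DeepSpeed's pipeline engine
# wraps a flat list of nn.Modules into stages; data parallelism is applied
# across pipeline replicas.
import deepspeed
from deepspeed.pipe import PipelineModule

layers = get_layers()  # hypothetical: the HF model split into a list of nn.Modules
model = PipelineModule(layers=layers, num_stages=4)
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config="ds_config.json",  # hypothetical DeepSpeed config path
)
```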
-
https://arxiv.org/abs/2402.19481
https://github.com/mit-han-lab/distrifuser
-
Hi,
I am trying to fine-tune a Llama 2 model with sequence parallelism using Megatron-DS. Is there any documentation for this?
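For what it's worth, Megatron-LM (and hence Megatron-DeepSpeed) enables sequence parallelism with a launch flag that rides on top of tensor parallelism. A hedged sketch of the relevant arguments, with the script name and all other flags as placeholders:

```python
# Hedged sketch: launching Megatron-DeepSpeed with sequence parallelism.
# --sequence-parallel shards layernorm/dropout activations along the
# sequence dimension and requires --tensor-model-parallel-size > 1.
import subprocess

subprocess.run([
    "deepspeed", "pretrain_gpt.py",          # placeholder script name
    "--tensor-model-parallel-size", "2",
    "--sequence-parallel",
    # ... model, data, and DeepSpeed config flags elided ...
])
```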
-
I have an inf2.24xlarge instance and am running the Llama-2 inference example, with all packages installed at their latest versions.
Everything worked fine until the step where I load the model with tp_degree = 24, and it faile…
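One thing to check, offered as a guess: inf2.24xlarge exposes 12 NeuronCores (6 Inferentia2 chips × 2 cores each), so tp_degree = 24 exceeds the hardware; that degree would need inf2.48xlarge. A minimal sketch assuming the transformers-neuronx Llama API, with the checkpoint path as a placeholder:

```python
# Sketch assuming the transformers-neuronx API. tp_degree must not exceed
# the NeuronCores on the instance (12 on inf2.24xlarge, 24 on inf2.48xlarge).
from transformers_neuronx.llama.model import LlamaForSampling

model = LlamaForSampling.from_pretrained(
    "path/to/llama-2-split",  # hypothetical path to the split checkpoint
    batch_size=1,
    tp_degree=12,  # fits the 12 NeuronCores of inf2.24xlarge
    amp="f16",
)
model.to_neuron()  # compile and shard the model across NeuronCores
```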
-
### Bug summary
Encountered an issue when using "descriptor": "dpa2" to train a model from scratch for 500k steps and then testing it on a merged validation dataset. The merged validatio…
-
### 🐛 Describe the bug
Hi, when I use a custom backend, I find that the FX graph the custom compiler receives does not contain the stream-related operations.
Then I found that the FX graph dropped those…
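A minimal sketch for reproducing this kind of observation: a debug backend that just prints the FX graph `torch.compile` hands it, so one can check which ops (e.g., stream operations) survive tracing. The function `f` is a placeholder workload:

```python
# Minimal debugging sketch: a custom torch.compile backend that dumps the
# FX graph it receives, then falls back to eager execution.
import torch

def inspect_backend(gm: torch.fx.GraphModule, example_inputs):
    gm.graph.print_tabular()  # print every node the backend sees
    return gm.forward         # no compilation, just run the graph as-is

@torch.compile(backend=inspect_backend)
def f(x):
    return torch.relu(x) + 1

f(torch.randn(4))
```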
-
Hello.
I am currently using the vllm library alongside data parallelism for my projects.
Up until version 0.2.6, it was feasible to designate specific GPUs for each worker explicitly, which was i…
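One hedged workaround for explicit per-worker GPU assignment is to wrap each data-parallel vLLM worker in a Ray actor; Ray sets `CUDA_VISIBLE_DEVICES` per actor, pinning each `LLM` to its own device. The model name is a placeholder:

```python
# Hedged workaround sketch: one Ray actor per data-parallel vLLM worker,
# each reserving one GPU, so every LLM instance is pinned to its own device.
import ray
from vllm import LLM, SamplingParams

@ray.remote(num_gpus=1)
class VLLMWorker:
    def __init__(self, model: str):
        self.llm = LLM(model=model)

    def generate(self, prompts):
        return self.llm.generate(prompts, SamplingParams(max_tokens=32))

ray.init()
workers = [VLLMWorker.remote("facebook/opt-125m") for _ in range(2)]
results = ray.get([w.generate.remote(["Hello"]) for w in workers])
```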
-
### Describe the feature
When running evaluation, my model type is vllm, with the following parameters:
![image](https://github.com/open-compass/opencompass/assets/97608046/851bccbb-1f7f-420c-b7ca-fc00677a12cf)
However, only one GPU is used for the evaluation task:
![image](https://github.com/open-…
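If it helps, a hedged guess at the fix: OpenCompass allocates GPUs per task from the model's `run_cfg`, while vLLM's own sharding comes from `tensor_parallel_size`, so both likely need to be set. A sketch of a config entry; the exact fields may differ by version and all values are assumptions:

```python
# Hedged sketch of an OpenCompass model config entry. num_gpus controls the
# GPUs OpenCompass assigns to the task; tensor_parallel_size controls how
# vLLM shards the model across them.
from opencompass.models import VLLM

models = [
    dict(
        type=VLLM,
        abbr="llama-2-7b-vllm",
        path="meta-llama/Llama-2-7b-hf",
        model_kwargs=dict(tensor_parallel_size=4),
        max_out_len=100,
        batch_size=32,
        run_cfg=dict(num_gpus=4, num_procs=1),
    )
]
```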
-
### Your current environment
-
### How would you like to use vllm
I'd like to use multiple vLLM instances in the same Python script, each on a different CUDA device. Is it possible to pin an `LLM` …
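A common pattern, sketched with hedging: because CUDA initialization is process-wide, each `LLM` usually lives in its own subprocess, with `CUDA_VISIBLE_DEVICES` set before vLLM is imported. The model name is a placeholder:

```python
# Hedged sketch: one subprocess per LLM, pinned to a GPU by setting
# CUDA_VISIBLE_DEVICES before vLLM initializes CUDA.
import os
import multiprocessing as mp

def worker(gpu_id: int, prompts: list):
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    from vllm import LLM, SamplingParams  # import after pinning the device
    llm = LLM(model="facebook/opt-125m")
    for out in llm.generate(prompts, SamplingParams(max_tokens=32)):
        print(gpu_id, out.outputs[0].text)

if __name__ == "__main__":
    mp.set_start_method("spawn")  # fresh processes, no inherited CUDA state
    procs = [mp.Process(target=worker, args=(i, ["Hello"])) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```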
-
**Context**
To compose per-parameter-sharding FSDP with `DTensor`-based tensor parallelism, we need to reshard an existing `DTensor` to its parent mesh and include the FSDP dim-0 sharding.
The cur…
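For readers unfamiliar with the setup, a minimal sketch of the 2D arrangement being discussed; the mesh shape and dim names are assumptions. The TP parameter lives on the "tp" submesh, and the open question concerns resharding it onto the parent 2D mesh so an FSDP `Shard(0)` placement can be added on the "dp" dim:

```python
# Minimal sketch of the 2D mesh under discussion (run under torchrun with
# 8 ranks): dim 0 for FSDP's dim-0 sharding, dim 1 for tensor parallelism.
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed._tensor import distribute_tensor, Shard

mesh_2d = init_device_mesh("cuda", (2, 4), mesh_dim_names=("dp", "tp"))
# A TP-sharded parameter on the 1D "tp" submesh of the parent 2D mesh.
tp_param = distribute_tensor(torch.randn(1024, 1024), mesh_2d["tp"], [Shard(0)])
```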