spmd Search Results - Githubissues

1000+ results
for spmd


chapel-lang/chapel #14405

Enable SPMD programming within a forall loop

This issue proposes that the Chapel language / module code be adjusted to allow for certain SPMD idioms within forall loops. This proposal is intended to meet both the needs of programmers wishing to …

mppf updated 4 years ago
12
pytorch/xla #8402

Kaggle Notebook: model return loss None on TPU

## ❓ Questions and Help Hi, I received loss None when training model. Anyone can help? Simple reproduction kaggle notebook [link](https://www.kaggle.com/code/liondude/notebook548442067d) ``` im…

manh3152924 updated 20 hours ago
1
pytorch/PiPPy #283

Share DeviceMesh between PiPPy and SPMD

I see there is a DeviceMesh abstraction in `spmd`: https://github.com/pytorch/PiPPy/blob/main/spmd/tensor/device_mesh.py Can we use this abstraction as shared infrastructure? For example, `Pipeline…

jamesr66a updated 2 years ago
5
pytorch/xla #5025

[RFC] Checking SPMD property for graphs generated by PyTorch

## 🚀 Feature Check that the graphs generated by PyTorch FSDP are SPMD. ## Motivation We have encountered scenarios for distributed training where the graphs generated by PyTorch are not SPMD. I…

pratnali-aws updated 5 months ago
5
pytorch/xla #7843

Test Script Not Utilizing All XLA Devices

## ❓ Questions and Help I'm running this official [script here](https://github.com/pytorch/xla/blob/master/test/test_train_mp_imagenet_fsdp.py), but I only see two xla devices being used, xla:0 and…

radna0 updated 3 months ago
1
vllm-project/vllm #7528

[Tracking issue] [Help wanted]: Multi-step scheduling follow…

Co-authored with @SolitaryThinker @Yard1 @rkooo567 We are landing multi-step scheduling (#7000) to amortize scheduling overhead for better ITL and throughput. Since the first version of multi-step…

comaniac updated 1 month ago
5
BinomialLLC/basis_universal #366

Compile error in cppspmd_sse.h with C++23

When compiling with c++23 the following errors are reported in cppspmd_sse.h: ``` In file included from /media/dezlow/Drive/Dev/C++/Oneiro/ThirdParty/KTX/lib/basisu/encoder/basisu_kernels_sse.cpp:…

MarkCallow updated 11 months ago
1
alpa-projects/alpa #560

[BUG] PipeshardParallel crashes when apply_grad part is empt…

Change this line https://github.com/alpa-projects/alpa/blob/ea50a4328064a2a4eeae9101b65058a21ba112b8/tests/pipeline_parallel/test_mlp.py#L34 from `jax.grad` to `alpa.grad`. I got this error ``` WAR…

merrymercy updated 2 years ago
1
llvm/llvm-project #53859

[OpenMP] Reductions with non-trivial types (e.g., complex) c…

SPMD-zation is required for good performance and the copy constructor used for non-trivial types can cause us to miss out on it. While SPMD-zation has conceptual limitations right now, the cases I'…

jdoerfert updated 2 years ago
2
pytorch/xla #6778

Spmd pre-training llama2 multi-machine training so slow?

spmd has a normal training speed using eight blocks on a single machine, but the communication overhead increases rapidly in the case of multiple machines device is: gpu: A100 * 8 * 2 spmd strategy …

mars1248 updated 5 months ago
23

