-
## ❓ Questions and Help
I printed all the thunks that were executed and found that many of them didn't appear in my TensorBoard trace. The execution order shown is also wrong.
I …
-
### 🚀 The feature, motivation and pitch
The current implementation provides no coordination around completion or failure.
A call to `load_state_dict` or `save_state_dict` should complete either wh…
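To illustrate the coordination being requested, here is a minimal, hypothetical sketch of all-succeed-or-all-fail semantics for a distributed checkpoint save. Threads stand in for ranks, a `threading.Barrier` stands in for an all-gather of per-rank statuses, and the names (`coordinated_save`, `CheckpointError`, `fail_rank`) are invented for this sketch, not part of any real API:

```python
import threading

class CheckpointError(RuntimeError):
    pass

def coordinated_save(rank, world_size, results, barrier, fail_rank=None):
    # Each "rank" writes its local shard, then all ranks agree on one outcome.
    try:
        if rank == fail_rank:
            raise IOError(f"rank {rank}: disk full")  # simulated local failure
        results[rank] = True                          # local shard written
    except Exception:
        results[rank] = False
    barrier.wait()          # stand-in for an all-gather of statuses
    if not all(results):    # any single failure poisons the whole save
        raise CheckpointError("checkpoint save failed on at least one rank")

def run(world_size, fail_rank=None):
    results = [False] * world_size
    barrier = threading.Barrier(world_size)
    failed_ranks = []
    def worker(rank):
        try:
            coordinated_save(rank, world_size, results, barrier, fail_rank)
        except CheckpointError:
            failed_ranks.append(rank)
    threads = [threading.Thread(target=worker, args=(r,)) for r in range(world_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return failed_ranks

print(run(4))                        # no failures: no rank raises -> []
print(sorted(run(4, fail_rank=2)))   # one rank fails: every rank raises
```

With this shape, no rank can believe a checkpoint completed while another rank's shard is missing, which is the invariant the feature request is after.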
-
### 🚀 The feature, motivation and pitch
### Motivation
SPMD sharding in pytorch/XLA offers model parallelism by sharding tensors within an operator. However, we need a mechanism to integrate thi…
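As a self-contained illustration of what "sharding tensors within an operator" means, here is a numpy sketch (not pytorch/XLA API) of a matmul whose left operand is sharded row-wise across simulated devices; each shard computes independently and the concatenated result matches the unsharded computation:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))   # activations, sharded along rows
w = rng.standard_normal((16, 4))   # weights, replicated on every "device"

devices = 4
shards = np.split(x, devices, axis=0)        # each device holds 2 rows of x
partial = [s @ w for s in shards]            # per-device local matmul
y_sharded = np.concatenate(partial, axis=0)  # logical (global) result

assert np.allclose(y_sharded, x @ w)         # identical to unsharded matmul
```

In a real SPMD runtime the split, local compute, and reassembly are handled by the compiler from a sharding annotation rather than written out by hand.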
-
# 🚀 Feature & Motivation
PyTorch/XLA recently launched PyTorch/XLA SPMD ([RFC](https://github.com/pytorch/xla/issues/3871), [blog](https://pytorch.org/blog/pytorch-xla-spmd/), [docs/spmd.md](https:…
-
CTSM went through a similar transformation, and we can likely use it as a guide to make this happen. But the uses of MCT modules in these core SLIM modules need to be removed. This is something that…
-
### Request description
In XLA there are the [sharding propagation pass](https://github.com/openxla/xla/blob/2eba54a187e03ccd0f65669234b80966bdbcda5e/xla/service/sharding_propagation.h#L66) and [SP…
-
In https://github.com/google/jax/issues/13081 we found that XLA doesn't support SPMD sharding of fast-fourier transform ops. It should!
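A small numpy sketch of why SPMD sharding of FFT ops is natural: an FFT applied along the last axis is independent per row, so sharding the batch axis across devices and concatenating the per-shard FFTs reproduces the global result exactly (this uses numpy rather than JAX/XLA, purely to show the mathematical decomposition):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 64))

devices = 4
shards = np.split(x, devices, axis=0)                  # shard the batch axis
per_shard = [np.fft.fft(s, axis=-1) for s in shards]   # local FFT per "device"
y = np.concatenate(per_shard, axis=0)

assert np.allclose(y, np.fft.fft(x, axis=-1))          # matches global FFT
```

Sharding along the transformed axis itself is harder (it requires a distributed FFT with communication), but the batch-sharded case above needs no communication at all.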
-
Here's an overview of the features we intend to work on in the near future.
## Core Keras
### Saving & export
- Implement saving support for sharded models (sharded weights files).
- Improve…
-
## ❓ Questions and Help
When starting GPU SPMD training with `torchrun`, why does the graph need to be compiled once per machine, even though the resulting graph is the same on each? Is there any way to avoid this?
-
Ported this issue from https://github.com/google/jax/issues/21562
This code
```python
import jax
import numpy as np
import jax.numpy as jnp
from jax.sharding import PartitionSpec as PS, Name…