spmd Search Results - Githubissues

1000+ results for spmd

BinomialLLC/basis_universal #366

Compile error in cppspmd_sse.h with C++23

When compiling with C++23, the following errors are reported in cppspmd_sse.h: ``` In file included from /media/dezlow/Drive/Dev/C++/Oneiro/ThirdParty/KTX/lib/basisu/encoder/basisu_kernels_sse.cpp:…

MarkCallow updated 10 months ago
1
openxla/xla #13306

HandleReshape in SPMD partitioner splits sharding in some ca…

In the SPMD partitioner's HandleReshape method, I have found a couple of cases where the sharding of a tensor is split. This happens in the gradient accumulation case in JAX. Example The p…

ptoulme-aws updated 5 months ago
2
pytorch/xla #4824

GSPMD + PyTorch Compile + TPU crash

Hi, I am trying to combine GSPMD and PyTorch Compile, but it doesn't work. I took a copy of the test script "test_train_spmd_imagenet.py" and tested it in Colab, and it started normally. However,…

agemagician updated 4 weeks ago
4
pytorch/xla #6660

Loading a model trained with multi-core TPU into a SPMD mode…

Hello, I have models trained using the multi-core TPU option. I have saved their checkpoints using ``` import torch_xla.core.xla_model as xm xm.save(checkpoint, path_checkpoints_file, master_only…

mfatih7 updated 7 months ago
22
BinomialLLC/basis_universal #242

clang 12 gives 14 deprecated copy constructor warnings in en…

These 14 warnings are new with clang 12. Many of the projects I am importing have or had this problem including `vulkan.hpp`, now fixed. ``` In file included from /Users/mark/Projects/khronos/gith…

MarkCallow updated 3 years ago
2
pytorch/xla #3755

[RFC] A high-level GSPMD API in PT/XLA (based on `xs.mark_sh…

## 🚀 [RFC] A high-level GSPMD API in PT/XLA (based on `xs.mark_sharding`) This RFC proposes a high-level API for GSPMD through a wrapper class and a partitioning rule function, based on `xs.mark_sh…

ronghanghu updated 5 months ago
17
pytorch/xla #7855

How to sync TPUs when using a pod with more than 1 VM in SPM…

## ❓ Questions and Help Generally we feel that, since most of the work in SPMD happens under the hood, it's hard to understand what is required from us when using it in order to sync between TPUs on a pod …

dudulightricks updated 3 months ago
3
chapel-lang/chapel #5722

Improve MPI Module

Next steps for the MPI module. ## Features - [ ] Extend functionality - Extend to MPI-2 and 3 routines - We currently support most of MPI-1, and a random subset of MPI-2 and MPI-3.…

ben-albrecht updated 3 years ago
4
RBigData/tasktools #1

Error when setting "preschedule=FALSE" and using more than 2…

Hi Drew, Hope you are doing well! I am using task_tools for our submission, but when I set `preschedule=FALSE` and used 25 nodes with more than 300K tasks to execute, I got the following error: ``…

arodri7 updated 2 years ago
1
openxla/xla #14600

Channel ids for collectives are not unique when custom shard…

We are seeing a compilation crash with a custom partitioning defined by the user. I'm attaching the repro script and instructions to repro. The error happens when running with TransformerEngine in ou…

Tixxx updated 3 months ago
13

