pytorch/torchdistx
Torch Distributed Experimental
BSD 3-Clause "New" or "Revised" License
116 stars
31 forks
issues
torchrun with deferred_init hang
#83
JsBlueCat
opened
7 months ago
0
support torch 2.1 dtensor
#82
JsBlueCat
opened
8 months ago
3
refactor lazy init to device-agnostic
#81
atalman
opened
9 months ago
0
Support PyTorch 2.1.0
#80
Seventeen17
opened
1 year ago
6
Build from source fails
#79
goswamig
opened
1 year ago
2
Add script to install cudnn with CUDA 11.7
#78
sheilaliuxl
opened
1 year ago
1
Is this project dead?
#77
eric-mitchell
opened
1 year ago
0
Add CUDA 11.7 & 11.8 to conda packaging
#76
cbalioglu
closed
1 year ago
0
Unblock CI
#75
cbalioglu
closed
1 year ago
0
Use new PyObjectSlot API
#74
jamesr66a
opened
1 year ago
3
Unable to build torchdistx for PT 2.0
#73
Vatshank
opened
1 year ago
1
Update reference to torch::TypeError
#72
mehtanirav
opened
1 year ago
2
de-materialize fake tensor/module
#71
GuanhuaWang
opened
1 year ago
0
Update requirements.txt
#70
Youssef1313
closed
2 years ago
2
No Suitable Distribution for PyTorch 1.14
#69
awgu
opened
2 years ago
1
AnyPrecision optimizer dynamic casting
#68
atturaioe
opened
2 years ago
0
Make python sub-packages visible
#67
atturaioe
opened
2 years ago
3
Python sub-packages are not visible
#66
atturaioe
opened
2 years ago
0
[AnyPrecision optimizer] add automatic BF16 support check (network and gpu)
#65
lessw2020
opened
2 years ago
0
Fix optimizers import
#64
atturaioe
closed
2 years ago
3
Optimizers dir doesn't have __init__.py
#63
atturaioe
closed
2 years ago
1
Add unittests for AnyPrecisionOptimizer
#62
rohan-varma
closed
2 years ago
0
Documentation for AnyPrecisionOptimizer
#61
rohan-varma
opened
2 years ago
3
Move optimizers to the Python dir
#60
cbalioglu
closed
2 years ago
3
[AnyPrecision optimizer] consider FP32 defaults, possibly automated via BF16 support check
#59
lessw2020
opened
2 years ago
1
[AnyPrecision optimizer] Kahan compensation buffer should be stored in state dict for checkpointing
#58
lessw2020
opened
2 years ago
0
[AnyPrecision optimizer] - needs unit tests
#57
lessw2020
closed
2 years ago
0
Patch for GossipGraD algorithm
#56
aovladi
closed
2 years ago
0
Add expecttest to CI requirements
#55
cbalioglu
closed
2 years ago
0
Remove CUDA 11.6 builds from nightlies
#54
cbalioglu
closed
2 years ago
0
Introduce `is_deferred` convenience function
#53
cbalioglu
closed
2 years ago
1
Add Bfloat16 optimizer with Kahan summation option for high precision updates
#52
lessw2020
closed
2 years ago
23
Add SymInt support
#51
pbelevich
closed
2 years ago
1
deferred_init on HF models regression: `aten::expand` has an argument of type `SymInt` which is not supported in a deferred-init context
#50
pbelevich
closed
2 years ago
0
Utility to check if a module needs to be materialized
#49
carmocca
closed
2 years ago
2
GossipGraD implementation
#48
aovladi
closed
2 years ago
4
Adjustments to SlowMo to work with recent changes in PyTorch
#47
aovladi
closed
2 years ago
0
Introduce `fake_cuda`
#46
cbalioglu
closed
2 years ago
2
Disable TSan builds to unblock slowMo tests (3)
#45
cbalioglu
closed
2 years ago
0
Disable TSan builds to unblock slowMo tests (2)
#44
cbalioglu
closed
2 years ago
0
Tsan
#43
cbalioglu
closed
2 years ago
0
Disable TSan builds to unblock slowMo tests
#42
cbalioglu
closed
2 years ago
0
Refactor `enableFakeMode`
#41
cbalioglu
closed
2 years ago
0
Improve type annotation for `deferred_init`
#40
cbalioglu
closed
2 years ago
0
torchdistx compiled with `tsan` sanitizer hangs on specific imports
#39
aovladi
closed
2 years ago
0
Refactor `enableDeferredInit`
#38
cbalioglu
closed
2 years ago
0
updated requirements for future distributed tests run
#37
aovladi
closed
2 years ago
0
Slow Momentum implementation
#36
aovladi
closed
2 years ago
2
Check meta op output
#35
cbalioglu
closed
2 years ago
0
Point the latest doc to v0.2.0
#34
cbalioglu
closed
2 years ago
0