Bluefog-Lib/bluefog
Distributed and decentralized training framework for PyTorch over graph
https://bluefog-lib.github.io/bluefog/
Apache License 2.0
291 stars, 71 forks
issues
#118 Bump torch from 1.4.0 to 2.2.0 (dependabot[bot], opened 2 months ago, 0 comments)
#117 compressor (xuyufei-a, opened 4 months ago, 0 comments)
#116 Work with newer torch (fecet, opened 1 year ago, 0 comments)
#115 Fixing verification of the hearbeat value (dgumenyuk, closed 1 year ago, 1 comment)
#114 Argument "disable_heartbeat" does not exist (dgumenyuk, closed 1 year ago, 1 comment)
#113 Is it possible to run more agents than the number of my CPU cores? (1qzhworld, closed 2 years ago, 2 comments)
#112 Error when calling push-sum optimizer (yangxuanfei, opened 2 years ago, 3 comments)
#111 Problems running decentralized trainning (yangxuanfei, closed 2 years ago, 0 comments)
#110 ImportError: /root/miniconda3/envs/bluefog/lib/python3.8/site-packages/bluefog/torch/mpi_lib.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor6deviceEv (yangxuanfei, opened 2 years ago, 3 comments)
#109 Add ref (kunyuan827, closed 2 years ago, 0 comments)
#108 some error happened (northhj, closed 2 years ago, 3 comments)
#107 when I Install Bluefog from Pip (GPU),some error happens (lkzs, closed 2 years ago, 6 comments)
#106 Mypy (hanbinhu, opened 2 years ago, 0 comments)
#105 Update README.rst (BichengYing, opened 2 years ago, 0 comments)
#104 when run "Applying BlueFog on Deep Learning problem(High Level API Introduction)",some error happened (lkzs, opened 2 years ago, 1 comment)
#103 Model trained by AWC style cannot be saved (kunyuan827, opened 2 years ago, 0 comments)
#102 test_neighbor_allreduce_dst_weight_fusion failed with MPI CUDA Aware case (BichengYing, opened 2 years ago, 0 comments)
#101 Fix the cuda stream creation in MPI (Bluefog-Lib, closed 2 years ago, 0 comments)
#100 CUDA initialized even when user didn't use CUDA at all in an environment with GPUs (hanbinhu, opened 2 years ago, 0 comments)
#99 Add -mca ^openib flag to test (BichengYing, closed 2 years ago, 0 comments)
#98 add Troubleshooting & doc bugfix (ymchen7, closed 2 years ago, 0 comments)
#97 Add BlueFog arxiv paper (BichengYing, closed 2 years ago, 0 comments)
#96 Add hierarchical related ops test (BichengYing, closed 3 years ago, 1 comment)
#95 Allow to control local size by environment variable (BichengYing, closed 3 years ago, 0 comments)
#94 Docker release (hanbinhu, closed 3 years ago, 0 comments)
#93 Release flow (hanbinhu, closed 3 years ago, 0 comments)
#92 Condition variable (BichengYing, closed 3 years ago, 0 comments)
#91 Add deprecation args and fix the comments in neighbor_allreduce (BichengYing, closed 3 years ago, 0 comments)
#90 Check the topology is the same cross all agents when call set_topology (BichengYing, opened 3 years ago, 0 comments)
#89 Disable heartbeat by default (BichengYing, closed 3 years ago, 0 comments)
#88 Add condition variable to control the loop (BichengYing, closed 3 years ago, 0 comments)
#87 Condition variable (BichengYing, closed 3 years ago, 1 comment)
#86 Topo service (BichengYing, closed 3 years ago, 0 comments)
#85 Revert "Topo service (#75)" (BichengYing, closed 3 years ago, 0 comments)
#84 No-op: format topo file only by Black (BichengYing, opened 3 years ago, 0 comments)
#83 Better design pattern for data_weight synchronization (hanbinhu, opened 3 years ago, 0 comments)
#82 Symmetrical argument for self_weight, src_weights, dst_weights (hanbinhu, opened 3 years ago, 0 comments)
#81 Add dst_weight for hierarchical neighbor allreduce (hanbinhu, opened 3 years ago, 0 comments)
#80 Context for CUDA and NCCL (hanbinhu, opened 3 years ago, 1 comment)
#79 Benchmark Example issue (hanbinhu, opened 3 years ago, 0 comments)
#78 Improve neighbor allreduce (hanbinhu, closed 3 years ago, 0 comments)
#77 Create doc.yml (BichengYing, closed 3 years ago, 0 comments)
#76 Add github action (BichengYing, closed 3 years ago, 0 comments)
#75 Topo service (BichengYing, closed 3 years ago, 0 comments)
#74 Dynamic neighbor allgather (BichengYing, closed 3 years ago, 0 comments)
#73 Update intra communication from MPI window operations to shared memory (lucweichen, opened 3 years ago, 0 comments)
#72 Add API to unregister window (hanbinhu, closed 3 years ago, 1 comment)
#71 ATC multi-step case (BichengYing, closed 3 years ago, 0 comments)
#70 Optimizer num_step_per_communication behavior change and test (hanbinhu, closed 3 years ago, 0 comments)
#69 Interactive bluefog (kunyuan827, closed 3 years ago, 0 comments)