Bluefog-Lib / bluefog

Distributed and decentralized training framework for PyTorch over graph
https://bluefog-lib.github.io/bluefog/
Apache License 2.0

NCCL an illegal memory access was encountered when running with 244*244*3 size dataset #44

Status: Closed (Bluefog-Lib closed this issue 4 years ago)

BichengYing commented 4 years ago

This is solved by pre-allocating the memory on the Python side first. The problem appears to lie in how memory is allocated for neighbor_allgather. However, the root cause is still unclear.
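To illustrate the general idea of the workaround, here is a minimal pure-Python sketch of allocating the output buffer up front and letting the gather step only copy into it. This is not Bluefog's code: `preallocate_gather_buffer` and `gather_into` are hypothetical helpers, and plain `array` objects stand in for GPU tensors.

```python
from array import array


def preallocate_gather_buffer(chunk_len, num_neighbors):
    """Allocate the full output buffer once, before any communication starts.

    Hypothetical helper: sizes the buffer for one chunk per neighbor.
    """
    return array("f", [0.0]) * (chunk_len * num_neighbors)


def gather_into(out, neighbor_chunks):
    """Hypothetical stand-in for a neighbor gather.

    Copies each neighbor's chunk into its slot of the pre-allocated buffer
    and never allocates or resizes the output itself.
    """
    offset = 0
    for chunk in neighbor_chunks:
        if offset + len(chunk) > len(out):
            raise ValueError("output buffer too small for gathered data")
        out[offset:offset + len(chunk)] = array("f", chunk)
        offset += len(chunk)
    return out


# Usage: the caller owns the buffer; the gather step only fills it.
out = preallocate_gather_buffer(chunk_len=2, num_neighbors=3)
gather_into(out, [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

The point of the pattern is that the buffer's size and lifetime are fixed by the caller before the communication step runs, so the gather never has to allocate memory on its own.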

BichengYing commented 4 years ago

The underlying problem is related to the ready signal of the input data. It should be resolved now.
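As a rough analogy for what a ready signal does, the sketch below uses a `threading.Event` so that a communication thread waits until the input buffer has been fully written before reading it. It is only an illustration under assumed names (`run_with_ready_signal`, `producer`, `communicator`) and does not reproduce Bluefog's or NCCL's actual synchronization.

```python
import threading


def run_with_ready_signal(size):
    """Fill a buffer in one thread and read it in another, gated by an event."""
    buffer = [0] * size
    ready = threading.Event()
    result = []

    def producer():
        # Write the input data, then signal readiness only after the last write.
        for i in range(size):
            buffer[i] = i + 1
        ready.set()

    def communicator():
        # Block until the input is fully written; reading earlier would
        # observe partially written data.
        ready.wait()
        result.append(sum(buffer))

    consumer = threading.Thread(target=communicator)
    writer = threading.Thread(target=producer)
    consumer.start()  # started first on purpose: it must still wait
    writer.start()
    writer.join()
    consumer.join()
    return result[0]


total = run_with_ready_signal(4)
```

Even though the reader thread starts first, it only touches the buffer after the writer has signaled, which is the ordering a ready signal is meant to guarantee.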