Audio-AGI / AudioSep

Official implementation of "Separate Anything You Describe"
https://audio-agi.github.io/Separate-Anything-You-Describe/
MIT License

pytorch_lightning, DDP: GPUs stuck at 100%, training stopped #29

Open anonymoussky opened 1 year ago

anonymoussky commented 1 year ago

Has anyone else run into this issue? Any suggestions? After the first epoch completes, somewhere in the middle of the second epoch, all GPUs get stuck at 100% utilization without any error, and training stops making progress. It looks like a common bug in pytorch_lightning when using DDP, but I have not found a solution yet.

https://github.com/Lightning-AI/lightning/issues/11242
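One way to narrow down a silent hang like this is to have every rank dump its Python stack when a training step takes too long. Below is a minimal diagnostic sketch using only the standard-library `faulthandler` module; it does not fix the hang. The class name `StepWatchdog`, the `hang_traces` directory, and the 30-minute timeout are hypothetical choices, and reading the rank from the `LOCAL_RANK` environment variable is an assumption about how the processes are launched.

```python
import faulthandler
import os


class StepWatchdog:
    """Dump all thread stacks of this process if a step exceeds timeout_s.

    Hypothetical helper for diagnosis only; it does not resolve the hang.
    """

    def __init__(self, timeout_s: float = 1800.0, log_dir: str = "hang_traces"):
        os.makedirs(log_dir, exist_ok=True)
        # Assumption: the launcher sets LOCAL_RANK; fall back to "0" otherwise.
        rank = os.environ.get("LOCAL_RANK", "0")
        self.path = os.path.join(log_dir, f"rank{rank}.log")
        self.timeout_s = timeout_s
        # faulthandler writes through the file descriptor, so keep the file open.
        self._file = open(self.path, "w")

    def arm(self) -> None:
        # Calling dump_traceback_later again replaces the previous timer,
        # so arming at the start of every step acts as a per-step timeout.
        faulthandler.dump_traceback_later(
            self.timeout_s, repeat=True, file=self._file
        )

    def close(self) -> None:
        faulthandler.cancel_dump_traceback_later()
        self._file.close()
```

Calling `arm()` at the start of every training step (for example from a callback hook that runs per batch) and `close()` when training ends should leave one log file per rank. If the ranks are stuck in different places, for instance one waiting in a collective operation while another is still in the data loader, that points to the ranks getting out of sync rather than to a problem in the model code.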