PeterL1n / RobustVideoMatting

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
https://peterl1n.github.io/RobustVideoMatting/
GNU General Public License v3.0
8.32k stars, 1.11k forks

About Distributed Training #231

Open tayton42 opened 1 year ago

tayton42 commented 1 year ago

Thank you for your research. I have a question about single-machine, multi-GPU training. When my code reaches self.model_ddp = DDP(self.model, device_ids=[self.rank], broadcast_buffers=False, find_unused_parameters=True), the processes that belong to the other GPUs also appear on GPU0, with the same PIDs. This makes GPU0 run out of memory. I can't find the cause or a solution. Please help me. Thanks! (Screenshot attached.)
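A common cause of this symptom in PyTorch, not confirmed for this issue, is that each worker process creates a CUDA context on the default device (GPU0) before it is pinned to its own GPU, for example through a torch.load call without map_location or an early .cuda() call. The sketch below shows the usual precaution: select the device first, then load and wrap. The names device_for_rank and build_ddp_model, the checkpoint_path argument, and the port value are hypothetical and are not taken from the RVM code.

```python
import os


def device_for_rank(rank: int) -> str:
    """Device string for the one GPU this process should use, e.g. 'cuda:1'."""
    return f"cuda:{rank}"


def build_ddp_model(rank: int, world_size: int, model, checkpoint_path=None):
    """Wrap `model` in DDP while keeping this process off the other GPUs."""
    # Imported here so device_for_rank stays usable on machines without PyTorch.
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    # The default env:// rendezvous needs an address and a port.
    # These values are assumptions for a single-machine run.
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "12355")

    # Select this process's GPU before any other CUDA call, so that no
    # CUDA context is created on the default device (GPU0) by accident.
    torch.cuda.set_device(rank)
    dist.init_process_group("nccl", rank=rank, world_size=world_size)

    device = device_for_rank(rank)
    if checkpoint_path is not None:
        # map_location keeps the loaded tensors on this rank's GPU instead of
        # the device they were saved from (often cuda:0).
        state = torch.load(checkpoint_path, map_location=device)
        model.load_state_dict(state)

    model = model.to(device)
    return DDP(
        model,
        device_ids=[rank],
        broadcast_buffers=False,
        find_unused_parameters=True,
    )
```

If the extra processes on GPU0 disappear after a change like this, the remaining places to check are any tensors or models created with a bare .cuda() or device="cuda" before torch.cuda.set_device runs.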

DommyWorld commented 1 year ago

Try using torchrun.

tayton42 commented 1 year ago

> Try using torchrun.

Thank you for your answer! I am not familiar with torchrun, though. Could you tell me how I should modify the RVM code? Thanks anyway!
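For readers with the same question, here is a minimal sketch of an entry point launched by torchrun. It is not the RVM training script. torchrun starts one process per GPU and sets the RANK, LOCAL_RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT environment variables, so the script reads them instead of spawning workers itself. The names read_torchrun_env and main are hypothetical, and the training step is left as a placeholder comment.

```python
import os


def read_torchrun_env(env=os.environ):
    """Return (rank, local_rank, world_size) from the variables torchrun sets."""
    return int(env["RANK"]), int(env["LOCAL_RANK"]), int(env["WORLD_SIZE"])


def main():
    # Imported here so read_torchrun_env can be used without PyTorch installed.
    import torch
    import torch.distributed as dist

    rank, local_rank, world_size = read_torchrun_env()

    # Pin this process to its own GPU before any other CUDA work.
    torch.cuda.set_device(local_rank)

    # With no explicit rank or world_size, the default env:// init method
    # reads them (and the master address and port) from the environment.
    dist.init_process_group(backend="nccl")

    # Placeholder: build the model on f"cuda:{local_rank}", wrap it in DDP
    # with device_ids=[local_rank], and run the training loop here.

    dist.destroy_process_group()


# Run only when started by torchrun, which always sets LOCAL_RANK.
if __name__ == "__main__" and "LOCAL_RANK" in os.environ:
    main()
```

Assuming the file is saved as train_sketch.py (a hypothetical name) on a machine with 4 GPUs, it would be started with: torchrun --standalone --nproc_per_node=4 train_sketch.py. Any existing mp.spawn call in the training script would then no longer be needed, since torchrun takes over process creation.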

Stephen-K1 commented 8 months ago

Hi. I have the same problem. Have you found a solution yet? It would help me a great deal if you could share your experience here. Thank you!