leexinhao closed this issue 1 year ago
@leexinhao Hi, I didn't have this problem during training. Could you please let me know your environments and training scripts?
I have reproduced this problem with torch 1.7.0 and torch 1.12.0, using slurm_train.sh. It is also mentioned in the mmaction2 documentation (adapt-image-models\docs\faq.md).
I am curious why you didn't encounter this problem.
I am using dist_train.sh as shown in https://github.com/taoyang1122/adapt-image-models/blob/main/run_exp.sh. Do you still have the problem when using this?
It also happens with dist_train.sh, so maybe it is caused by my mmaction2 environment. It's not a big deal, as I have reproduced your results. Thanks for your excellent work!
As the title says; otherwise you will encounter errors in DDP training.