CharlesNJ closed this issue 3 years ago
I think this has something to do with DistributedDataParallel or DataParallel, but I'm not sure what the problem is. I would appreciate it if you could point me to where to look. Thanks!
Could you provide more details about what you modified in the code and config?
@RangiLyu Thanks for responding! I made a silly mistake and was using the wrong config file. Once I have that sorted out, I will reopen the issue if the error still persists!
Hello @CharlesNJ , how did you solve it? I'm pretty sure I used the correct config file.
Yeah, sorry, I should have documented it better, but I think in my case it was just a matter of using the wrong file. Could you post your error with the traceback here?
Or even better, just open a new issue and tag me, and I will see if I recognize any error I've faced.
This can be resolved by replacing SyncBN with BN in model.roi_head.bbox_head (and everywhere else it appears), or alternatively by increasing the number of GPUs available. The cause is that SyncBN does not work in a single-GPU setup (https://github.com/pytorch/pytorch/issues/63662).
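For anyone who wants to apply the SyncBN-to-BN swap without hunting through the config by hand, here is a minimal sketch. It assumes the config is a plain nested dict/list structure in which norm layers are declared as `dict(type='SyncBN', ...)`. The helper name `replace_norm_type` and the config fragment are made up for illustration and are not taken from the actual config in this issue.

```python
def replace_norm_type(cfg, old="SyncBN", new="BN"):
    """Recursively replace every dict entry type=old with type=new, in place.

    Hypothetical helper, not part of any library.
    """
    if isinstance(cfg, dict):
        if cfg.get("type") == old:
            cfg["type"] = new
        for value in cfg.values():
            replace_norm_type(value, old, new)
    elif isinstance(cfg, (list, tuple)):
        for item in cfg:
            replace_norm_type(item, old, new)
    return cfg


# Illustrative config fragment (assumed layout, not the real config file).
model = dict(
    roi_head=dict(
        bbox_head=[
            dict(
                type="ConvFCBBoxHead",
                norm_cfg=dict(type="SyncBN", requires_grad=True),
            ),
            dict(
                type="ConvFCBBoxHead",
                norm_cfg=dict(type="SyncBN", requires_grad=True),
            ),
        ]
    )
)

replace_norm_type(model)
print(model["roi_head"]["bbox_head"][0]["norm_cfg"])
```

In practice it may be simpler to edit the config file directly and change each `norm_cfg=dict(type='SyncBN', requires_grad=True)` to `norm_cfg=dict(type='BN', requires_grad=True)`. The recursive walk is only handy when SyncBN appears in many places.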
Describe the bug
I think this is related to PyTorch, and I am trying to understand what the problem is. I am just trying to use a pre-trained model and transfer its weights to train a custom model.
Reproduction
What command or script did you run?
python tools/train.py 'configs/truck/cascade_mask_rcnn_swin_small_patch4_window7_mstrain_480-800_giou_4conv1f_adamw_3x_truck.py'
Did you make any modifications on the code or config? Did you understand what you have modified? Yes, but the problem does not seem to be in the config file.
What dataset did you use? Custom dataset with annotation for detection
Environment
I don't think this is an environment problem.
Error traceback