OceanPang / Libra_R-CNN

Code for CVPR 2019 paper "Libra R-CNN: Towards Balanced Learning for Object Detection"
Apache License 2.0

The model and loaded state dict do not match exactly #14

Closed SallyYingSong closed 5 years ago

SallyYingSong commented 5 years ago

Hi, I keep running into this error and warning when starting training, whichever dataset (COCO or my own) I use. I trained successfully with my own dataset before, but now it does not work. Do you have any idea what causes the error and the warning?

ERROR: subprocess.CalledProcessError: Command '['/home/.conda/envs/open-mmlab/bin/python', '-u', './tools/train.py', '--local_rank=0', './configs/libra_rcnn/libra_faster_rcnn_r101_fpn_1x.py', '--launcher', 'pytorch']' died with <Signals.SIGSEGV: 11>.

WARNING: The model and loaded state dict do not match exactly

Attachment: bug.txt

OceanPang commented 5 years ago

The mismatch warning is harmless: the unmatched fc layer is only used for ImageNet classification. For the error, I recommend checking install.md / getting_started.md carefully when setting up your environment. Also note that distributed training can only be used with multiple GPUs.
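To illustrate why the warning appears, here is a minimal sketch in plain Python. It only compares key names and is not the loader used by this repository. The parameter names are hypothetical, chosen to resemble a ResNet checkpoint pretrained on ImageNet. The checkpoint carries fc weights for the classifier, while the detection backbone has no fc layer, so those keys show up as unexpected and nothing the backbone needs is missing.

```python
# Illustrative sketch only: the key names below are assumptions that mimic a
# ResNet-style ImageNet checkpoint; they are not read from a real file.

def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, both sorted.

    missing    : keys the model expects but the checkpoint lacks
    unexpected : keys the checkpoint has but the model does not use
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical backbone parameters (a detector backbone has no fc layer).
backbone_keys = [
    "conv1.weight",
    "bn1.weight",
    "bn1.bias",
    "layer1.0.conv1.weight",
]

# The ImageNet checkpoint additionally stores the classification head.
imagenet_checkpoint_keys = backbone_keys + ["fc.weight", "fc.bias"]

missing, unexpected = diff_state_dict_keys(backbone_keys, imagenet_checkpoint_keys)
print("missing keys:   ", missing)
print("unexpected keys:", unexpected)
```

If the missing list were non-empty, some backbone weights would stay at their random initialization, which would be worth investigating. Only unexpected fc keys, as here, can be ignored.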

SallyYingSong commented 5 years ago

Thank you for the quick reply. I followed install.md to reset the environment. But when I ran torch.cuda.is_available() to verify PyTorch, following https://pytorch.org/get-started/locally/, the terminal shows False. I loaded nvidia/cuda/10.0 and installed pytorch, torchvision, and cudatoolkit 10.0, but it still returns False.

[Attached screenshot: libra R-CNN pytorch_verification_FALSE02]
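When torch.cuda.is_available() returns False, a few more values help narrow down the cause. The sketch below is a hypothetical helper (the name cuda_report is not part of PyTorch or this repository) that gathers them in one place. It assumes nothing about the cluster setup and simply reports what the current Python process can see.

```python
import os


def cuda_report():
    """Collect basic facts about the PyTorch/CUDA setup visible to this process."""
    report = {
        "torch_installed": False,
        "torch_version": None,
        "built_with_cuda": None,   # CUDA version PyTorch was built with, or None
        "cuda_available": False,
        "device_count": 0,
        # If this is set to an empty string or "-1", the GPUs are hidden.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment; return the defaults.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # torch.version.cuda is None for a CPU-only build of PyTorch.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


report = cuda_report()
for name, value in report.items():
    print(f"{name}: {value}")
```

If built_with_cuda is None, a CPU-only build was installed. If it shows a CUDA version but cuda_available is still False, the process likely cannot see a GPU or a compatible driver. On a cluster, one possible cause (an assumption, not confirmed in this thread) is running the check on a login node without GPUs rather than inside a GPU job.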

SallyYingSong commented 5 years ago

I tried it on my own computer and it returns True, but when I tried it on the supercomputer center it returns False. Maybe that is the reason why training does not work on the supercomputer center :)