NaN loss during training (but not with the MobileNet backbone)

Hi,

I'm trying to train from scratch on my own dataset (24 samples, 4 classes). With the MobileNetV2 (mnv2) backbone, training "works". With other backbones (ResNet-50, VoVNet), I get a NaN loss from the first iteration. Any ideas?

I tried copying the config files from config/centermask (for example centermask_V_19_eSE_FPN_lite_res600_ms_bs16_4x.yaml) and changed the following:

    INPUT:
      MIN_SIZE_RANGE_TRAIN: (720, 1280)
      MAX_SIZE_TRAIN: 1280
      MIN_SIZE_TEST: 720
      MAX_SIZE_TEST: 1280
      PIXEL_MEAN: [0, 0, 0]  # disabling this functionality
    DATALOADER:
      SIZE_DIVISIBILITY: 32
      NUM_WORKERS: 1
    SOLVER:
      BASE_LR: 0.001
      WEIGHT_DECAY: 0.0001
      STEPS: (0, 60000, 80000)
      MAX_ITER: 90000
      IMS_PER_BATCH: 2
      WARMUP_METHOD: "constant"
      CHECKPOINT_PERIOD: 1000
      TEST_PERIOD: 1000
    TEST:
      IMS_PER_BATCH: 1

I also added NUM_CLASSES: 4 to FCOS, RETINANET, and ROI_BOX_HEAD, and removed the WEIGHT parameter.
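
As a sanity check on the SOLVER values, here is a minimal sketch of how a plain multi-step schedule would treat STEPS (0, 60000, 80000). It assumes the schedule counts passed milestones with bisect_right and uses a decay factor of 0.1; both are assumptions (GAMMA is not set in the config), and warmup is ignored. The helper name lr_at is made up for illustration.

```python
from bisect import bisect_right

BASE_LR = 0.001            # SOLVER.BASE_LR from the config
STEPS = (0, 60000, 80000)  # SOLVER.STEPS from the config
GAMMA = 0.1                # assumed decay factor, not set in the config


def lr_at(iteration, base_lr=BASE_LR, steps=STEPS, gamma=GAMMA):
    """Learning rate under a plain multi-step schedule (warmup ignored)."""
    # bisect_right counts the milestones that are <= iteration
    passed = bisect_right(steps, iteration)
    return base_lr * gamma ** passed


for it in (0, 1000, 60000, 80000):
    print(it, lr_at(it))
```

Under these assumptions the milestone at 0 already counts as passed at iteration 0, so the rate starts one decay step below BASE_LR. That would make the effective learning rate smaller, not larger, so it probably does not explain the NaN on its own.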

youngwanLEE / CenterMask, issue #59