Open eyalbetzalel opened 1 year ago
Hi @eyalbetzalel
How many times have you trained and what model did you use?
Best,
Shiyi
@voidrank Hi, when I train MAL on my own dataset, the validation results are:
val/mIoU_small: 0.4333444 val/mIoU_medium: 0.523455 val/mIoU_large: nan
But when I try to generate the pseudo-labels, all of the results are wrong:
Validating: 100%|██████████| 11390/11390 [12:00<00:00, 12.46it/s]
val/mIoU: nan val/mIoU_small: nan val/mIoU_medium: nan val/mIoU_large: nan
I don't know what is causing this issue.
I found a bug in their optimizer implementation. After switching to SGD with momentum, the problem was solved.
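```python
def configure_optimizers(self):
    # Tail of the original optimizer call, left commented out:
    #     betas=self.args.optim_betas,
    #     lr=self._lr, weight_decay=self._wd)
    # Replacement: plain SGD with momentum, reusing the existing learning rate.
    optimizer = torch.optim.SGD(self.parameters(), lr=self._lr, momentum=0.9)
    return optimizer
```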
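For anyone unsure what the swap changes numerically, here is a minimal sketch of a single SGD-with-momentum update on one scalar parameter. It assumes the rule documented for `torch.optim.SGD` with its default `dampening=0` and `nesterov=False`. The function name `sgd_momentum_step` and the toy loss are made up for illustration and are not part of the repo.

```python
def sgd_momentum_step(param, grad, velocity, lr, momentum=0.9):
    """One SGD-with-momentum update for a single scalar parameter.

    Assumed rule (torch.optim.SGD defaults, dampening=0, nesterov=False):
    v <- momentum * v + grad, then p <- p - lr * v.
    velocity is None on the first step, where v is initialized to grad.
    """
    velocity = grad if velocity is None else momentum * velocity + grad
    return param - lr * velocity, velocity


# Toy example (hypothetical): minimize f(p) = p**2 starting from p = 1.0.
p, v = 1.0, None
for _ in range(3):
    grad = 2.0 * p  # gradient of the toy loss f(p) = p**2
    p, v = sgd_momentum_step(p, grad, v, lr=0.1)
```

Unlike adaptive optimizers, this update has no per-parameter scaling by a running second-moment estimate, so the step size depends only on `lr`, `momentum`, and the gradient history.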
Hi, sorry I haven't responded; I missed your message. The problem, as mentioned in the comment above, was in the optimizer.
Hi @eyalbetzalel, you will get NaN scores if you don't provide labels for specific categories.
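To illustrate how a missing category can surface as `nan`, here is a rough sketch of a size-bucketed mean IoU. It is not the repo's metric code: `mean_iou`, `ious_by_size`, and the IoU values are hypothetical, and the only assumption is that each bucket's score is an average over the instances that fall into it.

```python
import math


def mean_iou(ious):
    """Mean of a list of IoU values; NaN when the list is empty."""
    return sum(ious) / len(ious) if ious else math.nan


# Hypothetical per-instance IoUs grouped by object size.
ious_by_size = {
    "small": [0.41, 0.46],
    "medium": [0.50, 0.55],
    "large": [],  # no labeled large objects in this toy validation set
}

scores = {size: mean_iou(values) for size, values in ious_by_size.items()}

# Buckets that report NaN simply had nothing to average over.
empty_buckets = [size for size, score in scores.items() if math.isnan(score)]
```

Under that assumption, a single `nan` bucket (such as `val/mIoU_large` in the report above) would point to a size range with no labeled instances, whereas `nan` in every bucket would mean that no labels were matched at all.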
Hi,
When I train the network from the beginning it works fine, but when I resume training from the checkpoint file (ViT for COCO from epoch 6 that is posted here), I get this issue:
I tried decreasing the learning rate, and it didn't help.
Any ideas?