kimin-yun closed this issue 1 year ago
Thank you for updating the code. If I understand correctly, the training code in the current README performs Semantic Segmentation training with Mask2Former, not the method proposed in your paper. This would be the way to generate the Cityscapes Inlier Training model in the RbA Model Zoo.
It seems that this model is the basis for further training with the method proposed in your paper. Judging by the checkpoint path in the configs,
model_logs/mask2former_dec_layers_2_res5_only/model_final.pth
this appears to be the model produced by the Cityscapes Inlier Training, and it is the one that should be used as the initial weights.
Also, the modified code still gives a NaN error. Upon further inspection, the error seems to come from the following context manager, which the original Mask2Former training code does not use:
with torch.autograd.set_detect_anomaly(True):
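To illustrate why this context manager can turn a silent NaN into a hard failure, here is a minimal sketch. It is not taken from the RbA or Mask2Former code; the function name `backward_with_nan` and the toy computation are assumptions chosen only to produce a NaN gradient. With anomaly detection enabled, autograd raises a `RuntimeError` as soon as a backward function returns NaN; without it, the NaN simply ends up in the gradient.

```python
import torch


def backward_with_nan(detect_anomaly: bool) -> torch.Tensor:
    """Run a backward pass whose gradient is NaN and return that gradient.

    Hypothetical helper for illustration only. The gradient of sqrt(x) at
    x = 0 is infinite, and multiplying the output by 0 makes the backward
    pass compute 0 / 0, which is NaN.
    """
    x = torch.tensor([0.0], requires_grad=True)
    with torch.autograd.set_detect_anomaly(detect_anomaly):
        y = (torch.sqrt(x) * 0.0).sum()
        y.backward()  # raises RuntimeError only when detect_anomaly is True
    return x.grad


# Without anomaly detection the NaN passes through silently.
print(backward_with_nan(False))

# With anomaly detection the same computation aborts with a RuntimeError.
try:
    backward_with_nan(True)
except RuntimeError as err:
    print("anomaly detection raised:", type(err).__name__)
```

If training uses mixed precision with a gradient scaler, steps with non-finite gradients are normally skipped rather than treated as fatal, which may be one reason the same training runs without this wrapper. That is a possible explanation, not something confirmed for this repository.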
If there's anything I've misunderstood, I'd appreciate it if you could correct me. I'm also planning to try training with the modules proposed in your paper.
Hi @kimin-yun, thank you for your interest in our work.
The issues you encountered result from our code cleanup and the renaming of models for clarity, and we apologize for the inconvenience. We have applied the following changes to address the problems.
We removed
with torch.autograd.set_detect_anomaly(True):
in train_net.py, as you suggested. It was used at some point for debugging purposes but is no longer needed in the main training code.
As for the training code, the config files we provide cover both inlier training and RbA fine-tuning with OoD data. The checkpoint model_logs/mask2former_dec_layers_2_res5_only/model_final.pth is the same as ./ckpts/swin_b_1dl/model_final.pth. We renamed it for clarity, but the old path remained in some of the config files. We have now updated all config files to the new path. ./ckpts/swin_l_1dl/config.yaml is also used for inlier training, but with the Swin L backbone.
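For anyone working from an older checkout whose configs still reference the old checkpoint path, the rename described above can be sketched as a plain text substitution. This is only an illustration: the helper names `update_weights_path` and `update_config_file` are hypothetical, and the sketch assumes the old path appears verbatim in the config file.

```python
from pathlib import Path

# Paths quoted in this thread: the old name and the renamed checkpoint.
OLD_WEIGHTS = "model_logs/mask2former_dec_layers_2_res5_only/model_final.pth"
NEW_WEIGHTS = "./ckpts/swin_b_1dl/model_final.pth"


def update_weights_path(config_text: str) -> str:
    """Return the config text with the old checkpoint path replaced."""
    return config_text.replace(OLD_WEIGHTS, NEW_WEIGHTS)


def update_config_file(path: Path) -> bool:
    """Rewrite one config file in place; return True if it was changed."""
    original = path.read_text()
    updated = update_weights_path(original)
    if updated != original:
        path.write_text(updated)
        return True
    return False
```

Pulling the latest version of the repository makes this unnecessary, since the config files there already use the new path.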
For outlier fine-tuning with RbA, you can use the following configs, which should work now that the NaN issue is fixed:
./ckpts/swin_b_1dl_rba_ood_coco/config.yaml
./ckpts/swin_b_1dl_rba_ood_map_coco/config.yaml
./ckpts/swin_l_1dl_rba_ood_map_coco/config.yaml
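Since these fine-tuning configs start from an inlier-trained checkpoint, it can help to confirm that the referenced weights file exists before launching a run. The following is a rough sketch, not part of the repository: `find_weights` and `weights_available` are hypothetical helpers, and the sketch assumes the config contains a `WEIGHTS:` key on its own line with a path relative to the repository root.

```python
import re
from pathlib import Path
from typing import Optional

# Matches a line such as "  WEIGHTS: ./ckpts/swin_b_1dl/model_final.pth".
WEIGHTS_LINE = re.compile(r"^\s*WEIGHTS:\s*(\S+)\s*$", re.MULTILINE)


def find_weights(config_text: str) -> Optional[str]:
    """Return the first WEIGHTS path in the config text, or None if absent."""
    match = WEIGHTS_LINE.search(config_text)
    if match is None:
        return None
    value = match.group(1).strip("'\"")  # drop optional YAML quotes
    return value or None


def weights_available(config_path: Path, repo_root: Path) -> bool:
    """Check that the checkpoint named in the config exists on disk."""
    weights = find_weights(config_path.read_text())
    if weights is None:
        return True  # the config does not request initial weights
    return (repo_root / weights).is_file()
```

A call such as `weights_available(Path("ckpts/swin_b_1dl_rba_ood_coco/config.yaml"), Path("."))` returning `False` would point to the missing-weights error before any training starts.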
Thank you again for your interest. Please do not hesitate to report any issues you encounter, and we will try to address them as soon as possible.
Thank you for your helpful responses and updates.
Firstly, I want to congratulate you on the presentation of your paper at ICCV. I found the paper quite interesting and I'm trying to understand the algorithm using the provided code. The evaluation seems to work without any issues, but I'm encountering errors with the training code.
I tried running the following code, as suggested:
And I got the following error:
Looking at the original Mask2Former code, there is a line in the mask_former_semantic_dataset_mapper.py file, around line 183:
instances.gt_classes = torch.tensor(classes, dtype=torch.int64)
This line is missing in the version I am using, which seems to be causing the error. After adding it back, I got a new error, a RuntimeError caused by a NaN value, which prevents the training from proceeding:
From what I understood from the paper, the Cityscapes Inlier Training with Mask2Former config is a baseline trained on Cityscapes, and the RbA + COCO Outlier Supervision config is the method proposed for performance improvement.
When trying to run only the RbA + COCO Outlier Supervision training config, I get an error about missing initial weights at
WEIGHTS: model_logs/mask2former_dec_layers_2_res5_only/model_final.pth
I tried using the model_final from the Cityscapes Inlier Training, but I still get an error during training.
While I recognize that this could be an issue with the way I have set up Mask2Former, and I plan to try training Mask2Former again, I would appreciate any help you could provide in resolving this issue or making the training code operational.