rizzoligiulia closed this issue 1 month ago
Hi, this seems to be caused by one of our options here: https://github.com/KU-CVLAB/CAT-Seg/blob/6d3a188af95165147fe2f34a8237fa7d2633e784/cat_seg/modeling/transformer/model.py#L596 The pad_len is set to 256 by default, which keeps only the top 256 classes and saves memory and time during inference. If you want to run the model with all of the classes, you can set pad_len = 0, which disables this feature.
Hello, thank you for your answer.
The error I am facing regards the ground truth itself, i.e., the "sem_seg" in each batched_inputs. Even for A-847 and P-459, it seems to be an issue.
Hi, we've checked in our environment and this issue doesn't occur with our setup. We've had reports where the detectron2 library was the problem, so re-installing detectron2 might help. The dataset preprocessing might also be an issue, so please check that you have followed the dataset preparation steps properly, since the prepare_dataset.py files process the annotation files.
The thing is:
What would be the correct detectron2 version to use?
Hi,
We've taken a deeper look into detectron2 and the problem seems to be in the default dataloader. This is what we've figured out so far:
We're not sure what you want to do with the GT label during inference, but this also seems to happen with the newest version, so the only solution seems to be using a modified mapper in the test dataloader instead of the default mapper. The problematic part is https://github.com/facebookresearch/detectron2/blob/5b72c27ae39f99db75d43f18fd1312e1ea934e60/detectron2/data/dataset_mapper.py#L159
The "L" option in read_image converts the image to uint8, which clips every index above 255. Changing this line to match what we have in the training loader, i.e. https://github.com/KU-CVLAB/CAT-Seg/blob/6d3a188af95165147fe2f34a8237fa7d2633e784/cat_seg/data/dataset_mappers/mask_former_semantic_dataset_mapper.py#L114, should do the trick.
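To make the effect concrete, here is a minimal sketch of what an 8-bit conversion does to a label map with class indices above 255. It is only an illustration: it does not call detectron2 or PIL, and `to_8bit` is a hypothetical helper that emulates the saturation described above.

```python
# Illustration only: emulates an 8-bit conversion that saturates at 255.
# `to_8bit` is a hypothetical helper, not part of detectron2 or CAT-Seg.

def to_8bit(label_map):
    """Saturate every class index to the uint8 range [0, 255]."""
    return [[min(max(v, 0), 255) for v in row] for row in label_map]


# A tiny made-up ground-truth map, as it might look for a dataset
# with more than 255 classes (e.g. indices up to 846).
gt = [
    [3, 120, 255],
    [256, 400, 846],
]

clipped = to_8bit(gt)

original_ids = sorted({v for row in gt for v in row})
clipped_ids = sorted({v for row in clipped for v in row})

print(original_ids)  # [3, 120, 255, 256, 400, 846]
print(clipped_ids)   # [3, 120, 255]
```

Every index above 255 collapses onto 255, so the distinct classes 256, 400 and 846 become indistinguishable in "sem_seg". Reading the annotation file without the 8-bit conversion in the test mapper should keep the larger indices intact.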
Let me know if this helped!
When using the CAT-Seg model with a dataset that has more than 255 classes, the current implementation appears to clip all class indices above 255 and map them to class 255. This is likely due to a limitation in the underlying Detectron2 library, which CAT-Seg is built upon.