I tried to fine-tune the rotated_rtmdet_l-coco_pretrain-3x-x model on a new dataset after changing the number of classes in the config. The bounding box loss (bbloss) stays at zero for the whole run. While debugging, I found that the model predicts background for every sample; those predictions are discarded, so no boxes contribute to the loss and it evaluates to zero.
Steps to Reproduce:
Load the rotated_rtmdet_l-coco_pretrain-3x-x model.
Modify the number of classes in the configuration to match the new dataset.
Start training the model on the new dataset.
Observe the bounding box loss during training.
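For reference, the class-count change in the steps above usually amounts to a small config override along the lines of the sketch below. This is not the config used in this issue: the base file name, the class names, and the checkpoint path are placeholders, and it assumes the head is configured under model.bbox_head and the dataset takes its class names from metainfo.

```python
# Hypothetical override config. Every file name, path, and class name
# here is a placeholder, not a value taken from this issue.
_base_ = ['./my_base_rotated_rtmdet_config.py']  # placeholder base config

# Class names of the new dataset (placeholders). A single-class dataset
# needs a trailing comma, e.g. ('ship',), or the value is a plain string.
classes = ('class_a', 'class_b', 'class_c')
metainfo = dict(classes=classes)

# The detection head must predict exactly len(classes) categories.
model = dict(bbox_head=dict(num_classes=len(classes)))

# The datasets must expose the same class list as the head.
train_dataloader = dict(dataset=dict(metainfo=metainfo))
val_dataloader = dict(dataset=dict(metainfo=metainfo))

# Checkpoint to initialise from (placeholder path).
load_from = 'path/to/coco_pretrained_checkpoint.pth'
```

Keeping num_classes derived from len(classes) avoids the two values drifting apart when the class list changes.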
Observed Behavior:
The bounding box loss remains at zero throughout the training.
The model predicts background for every sample, so the loss evaluates to zero.
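The mechanism behind a loss of exactly zero can be illustrated with a simplified sketch. It assumes the MMDetection convention that labels 0..num_classes-1 are foreground and the label num_classes means background, and that the box loss is averaged over foreground samples only. The function name and the toy numbers are made up for illustration; this is not MMRotate's implementation.

```python
def bbox_loss_over_positives(labels, per_sample_loss, num_classes):
    """Average a per-sample box loss over foreground samples only.

    Assumes label == num_classes means background. Returns the pair
    (mean loss, number of foreground samples).
    """
    positives = [i for i, label in enumerate(labels)
                 if 0 <= label < num_classes]
    if not positives:
        # No foreground sample: nothing contributes, so the loss is 0.
        return 0.0, 0
    total = sum(per_sample_loss[i] for i in positives)
    return total / len(positives), len(positives)


num_classes = 3
losses = [0.5, 0.9, 0.25, 0.7]  # toy per-sample box losses

# Every sample is background (label 3): the loss collapses to zero.
all_background = bbox_loss_over_positives([3, 3, 3, 3], losses, num_classes)

# Two foreground samples (indices 0 and 2): the loss is their mean.
some_foreground = bbox_loss_over_positives([0, 3, 2, 3], losses, num_classes)
```

Logging the number of foreground samples per iteration next to the loss is a cheap way to tell a loss that is truly zero from one that simply has nothing to average over.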
Expected Behavior:
The model should learn to predict the classes in the new dataset, and the bounding box loss should reflect the difference between the predicted and ground-truth boxes.
Hyperparameters Tried:
Different learning rates
Various batch sizes
Adjusted weight decay and momentum parameters
Additional Information:
I am confident that I am using the pre-trained weights from the COCO dataset.
The class annotations in the new dataset have been verified to be correct.
Possible Causes Considered:
Incorrect number of classes specified in the configuration
Pre-trained weights not being properly loaded
Data augmentation or preprocessing issues
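One way to check the second cause is to compare parameter names and shapes between the checkpoint and the freshly built model. The sketch below works on plain name-to-shape mappings so that it runs without a GPU or checkpoint; the parameter names and shapes in the toy data are hypothetical. In a real run the mappings would presumably be built from model.state_dict() and from the checkpoint's weights (assumed to sit under a "state_dict" key, as is usual for OpenMMLab checkpoints).

```python
def compare_state_shapes(ckpt_shapes, model_shapes):
    """Classify model parameters as loaded, shape-mismatched, or missing.

    Both arguments map parameter names to shape tuples.
    """
    loaded, mismatched, missing = [], [], []
    for name, shape in model_shapes.items():
        if name not in ckpt_shapes:
            missing.append(name)       # absent from the checkpoint
        elif tuple(ckpt_shapes[name]) != tuple(shape):
            mismatched.append(name)    # present, but cannot be copied
        else:
            loaded.append(name)        # name and shape agree
    return loaded, mismatched, missing


# Toy example with hypothetical parameter names: an 80-class checkpoint
# compared against a model rebuilt for 3 classes.
ckpt = {
    'backbone.conv.weight': (32, 3, 3, 3),
    'bbox_head.cls.weight': (80, 256, 1, 1),
}
model = {
    'backbone.conv.weight': (32, 3, 3, 3),
    'bbox_head.cls.weight': (3, 256, 1, 1),
    'bbox_head.angle.weight': (1, 256, 1, 1),
}
loaded, mismatched, missing = compare_state_shapes(ckpt, model)
```

A mismatch on the classification layer is expected after changing the number of classes. A backbone or neck that shows up under missing or mismatched would suggest the pre-trained weights are not being applied.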
Prerequisite
Task
I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.
Branch
1.x branch https://github.com/open-mmlab/mmrotate/tree/1.x
Environment
sys.platform: win32
Python: 3.8.19 (default, Mar 20 2024, 19:55:45) [MSC v.1916 64 bit (AMD64)]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0: NVIDIA GeForce RTX 2080 Ti
CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8
NVCC: Cuda compilation tools, release 11.8, V11.8.89
MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.39.33523 for x64
GCC: n/a
PyTorch: 1.8.0+cu111
PyTorch compiling details: PyTorch built with:
TorchVision: 0.9.0+cu111
OpenCV: 4.10.0
MMEngine: 0.10.4
MMRotate: 1.0.0rc1+1dc8d77
Reproduces the problem - code sample
Reproduces the problem - command or script
python tools/train.py configs/rotated_rtmdet/rotated_rtmdet_l-coco_pretrain-3x-dataset3.py
Reproduces the problem - error message
bbloss == 0
Additional information
No response