krumo / Domain-Adaptive-Faster-RCNN-PyTorch

Domain Adaptive Faster R-CNN in PyTorch

Why do I have to provide the ann_file of the target domain dataset? #24

Closed Icecream-blue-sky closed 3 years ago

Icecream-blue-sky commented 3 years ago

The original domain adaptive Faster R-CNN is unsupervised for the target domain, which means there are no labels for the target domain dataset. But paths_catalog.py provides an ann_file for foggycityscapes...._train_cocostyle. Why? Doesn't that mean the model already sees the foggy_cityscapes images in the training process? How can I train the model with an unlabeled target domain dataset? Thanks!

krumo commented 3 years ago

In a typical unsupervised domain adaptation setting, you train the model with labeled source data and unlabeled target data, which means your model will see some target images without knowing their category information. The reason you need to feed a .json file for the target domain to this codebase is that it requires a file containing the names of the target images and some meta information (e.g., image size) for training and testing. For the target domain training images you don't have to provide real annotations. I tried filling the annotation part with a pseudo one, such as a bbox as large as the whole image, and it works. We need to provide the annotation part for the target images because this code is based on maskrcnn-benchmark, which is designed for supervised object detection and requires every input image to have an annotation part. I didn't intend to make large changes to the original codebase. However, if too many people find this confusing, I will consider rewriting it.
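For reference, here is a minimal sketch of what such a pseudo annotation file could look like. The file names, image sizes, and category below are placeholders rather than anything from this repo; the only property that matters is that every target image carries at least one annotation:

```python
import json

# Hypothetical sketch: build a COCO-style annotation file for unlabeled target
# images, giving each image a single pseudo box covering the whole image.
# File names, sizes, and the category are placeholders.
images, annotations = [], []
for idx, (file_name, width, height) in enumerate(
    [("target_0001.png", 2048, 1024), ("target_0002.png", 2048, 1024)]
):
    images.append({"id": idx, "file_name": file_name,
                   "width": width, "height": height})
    annotations.append({"id": idx, "image_id": idx, "category_id": 1,
                        "bbox": [0, 0, width, height],  # whole-image pseudo box
                        "area": width * height, "iscrowd": 0})

coco_dict = {"images": images, "annotations": annotations,
             "categories": [{"id": 1, "name": "car"}]}
with open("target_train_cocostyle.json", "w") as f:
    json.dump(coco_dict, f)
```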

Icecream-blue-sky commented 3 years ago

Thanks for replying. But I still have two questions:

1. Does it mean I have to create an annotation.json with pseudo labels rather than an empty annotation.json?
2. What will happen if I provide an anno.json with real labels for the target domain dataset? Will the model use these target domain labels to calculate the loss?

I need to confirm this because I have tried to train the model on a new dataset (RTTS) and got extremely bad results (the loss_rpn_box_reg and loss_da_instance don't converge), as you can see in the picture. I don't know what the cause is; probably the model itself? [image: loss_curve]

krumo commented 3 years ago

1. If I remember correctly, the answer is yes. Since maskrcnn-benchmark filters out all images without annotations during training (see the dataset filtering code here), it is necessary to provide some pseudo annotations; a sketch of that filtering follows this comment.
2. Providing real annotations for target images won't influence your adaptation performance. The model does not use the target domain labels to calculate the loss. The adaptation performance depends on the domain gap between your source and target datasets. Please check whether your adapted model performs better than the unadapted one.

Since we expect the image-level and instance-level features to confuse their corresponding domain discriminators, a relatively large loss_da_image and loss_da_instance is expected; if my memory is correct, they should sit around 0.69. It is abnormal for your loss_rpn_box_reg not to converge. I would suggest adjusting the weights for loss_da_image and loss_da_instance. Also, since adversarial training is quite unstable, you might need to tune the number of training iterations for each task. It would be best to track performance on the test set during training so you can quickly figure out which part is wrong.
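On point 1, this is a paraphrased sketch of the kind of filtering maskrcnn-benchmark applies when building the training set (not the exact repo code), which is why each target image needs at least one pseudo annotation:

```python
# Paraphrased sketch, not the exact maskrcnn-benchmark code: images whose
# annotation list is empty are dropped when the training dataset is built.
annotations_per_image = {
    0: [{"bbox": [0, 0, 100, 100]}],  # kept
    1: [],                            # filtered out: no annotations
    2: [{"bbox": [5, 5, 20, 20]}],    # kept
}

kept_ids = [img_id for img_id, anno in annotations_per_image.items() if anno]
print(kept_ids)  # [0, 2]
```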
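The 0.69 figure is the chance-level binary cross-entropy: a domain discriminator that cannot tell source from target predicts p = 0.5 for either domain, so its loss settles at ln 2:

```python
import math

# A domain discriminator at chance level outputs p = 0.5 for either domain,
# so its binary cross-entropy loss is -ln(0.5) = ln(2) ≈ 0.693.
print(-math.log(0.5))  # 0.6931471805599453
```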

Icecream-blue-sky commented 3 years ago

Thanks for such a detailed reply. I'll check it and try to figure out the cause. Have a nice day!

Fly-dream12 commented 3 years ago

Have you checked it out and obtained better results with the adapted model? The performance is even worse on my dataset when doing domain adaptation. Are there any tricks? @Icecream-blue-sky @krumo