Qualcomm-AI-research / InverseForm


Training the InverseForm net on a custom dataset #7

Open yeshwanth95 opened 3 years ago

yeshwanth95 commented 3 years ago

Hi! Based on your paper, I understand that you have retrained the IFLoss separately for each dataset (as specified in your supplementary materials). I am presently using a custom aerial imagery dataset for my experiments and would like to follow the same protocol of retraining IFLoss on this dataset. I hope you can offer some insights on how I could go about doing this. Even better, I would appreciate it if you could share the training script, which would save a great deal of time for me.

I'm also curious as to why you retrained the IFLoss for each dataset separately. Since the InverseForm network is trained only on the GT seg masks, are the structure and shape of the GT seg masks so drastically different between the various datasets that the IFLoss requires retraining on each one?

Looking forward to your reply. Thank you!

sborse3 commented 3 years ago

Hi @yeshwanth95, thanks for your questions! We are not able to share the training script, but I'm happy to answer any specific questions.

Initially, we trained a single IF module on ImageNet and used it for all datasets. That approach yielded results similar to the current method. We stuck with the current strategy (training on the same dataset) to highlight that the improvements do not come from auxiliary datasets.

yeshwanth95 commented 3 years ago

Hi @sborse3, thanks for your reply. Just a small clarification: you say you trained the IF module on ImageNet, but I was under the assumption that GT seg masks are needed to train the IF module. If I'm not wrong, ImageNet does not offer GT masks, so could you clarify how you trained the IF module on that dataset?

Also, if I wish to train the IF module on my own dataset, I understand I need an STN to generate the transformed GT masks. Could you kindly share the details/pretrained weights of the STN you used in your pipeline, so that I can generate the transformed GT masks myself?

Finally, going by your answer, can I safely assume that the dataset on which the IF loss is trained does not affect its performance in estimating the homography between edge masks of different datasets when plugged into a segmentation network?

Kindly excuse the long message. Looking forward to your reply. Thank you.

sborse3 commented 3 years ago
  1. Edges are much easier to extract from GT seg masks because of their categorical nature, but we can extract edges straight from the input image as well.
  2. STN: https://pytorch.org/tutorials/intermediate/spatial_transformer_tutorial.html (see the sketch after this list for one way points 1 and 2 fit together)
  3. That's correct based on our experiments and observations!
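
To make points 1 and 2 concrete, here is a minimal sketch of one way to extract a boundary map from a categorical GT mask and to generate an STN-style warped copy together with its transform. The 4-neighbour edge rule and the 0.1 perturbation scale below are illustrative assumptions, not our exact recipe:

```python
import torch
import torch.nn.functional as F

def mask_to_boundary(mask: torch.Tensor) -> torch.Tensor:
    """Binary boundary map from a categorical GT seg mask of shape (H, W).

    A pixel is marked as boundary if any 4-neighbour has a different label.
    """
    m = mask.float()[None, None]                       # (1, 1, H, W)
    p = F.pad(m, (1, 1, 1, 1), mode="replicate")       # pad by 1 on each side
    c = p[..., 1:-1, 1:-1]                             # centre crop, equals m
    diff = (c != p[..., :-2, 1:-1]) | (c != p[..., 2:, 1:-1]) \
         | (c != p[..., 1:-1, :-2]) | (c != p[..., 1:-1, 2:])
    return diff.float()[0, 0]                          # (H, W), values in {0, 1}

def random_affine_pair(boundary: torch.Tensor):
    """Warp a boundary map with a random affine theta, STN-style.

    Returns (original, warped, theta); the jitter magnitudes are assumptions.
    """
    theta = torch.eye(2, 3)                            # start from identity affine
    theta[:, :2] += 0.1 * torch.randn(2, 2)            # rotation/scale/shear jitter
    theta[:, 2] += 0.1 * torch.randn(2)                # translation jitter
    b = boundary[None, None]                           # (1, 1, H, W)
    grid = F.affine_grid(theta[None], b.shape, align_corners=False)
    warped = F.grid_sample(b, grid, align_corners=False)
    return boundary, warped[0, 0], theta
```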
yeshwanth95 commented 3 years ago

Hi @sborse3, thanks for your reply. I have been trying to use an STN to generate the transformed GT masks that are required to train the InverseForm net.

I am still not sure how to replicate the procedure you followed. Could you kindly specify what architecture/dataset you used to train the STN? The tutorial you shared uses the MNIST dataset. Was the pretrained STN (specified in the supplementary material) also trained using the MNIST dataset?

Also, if my assumption is correct, the InverseForm net is trained to regress the affine matrix values produced by the pretrained STN. What loss function do you use between the affine matrix values from the STN and those predicted by the InverseForm net?
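
To make my assumption concrete, this is the kind of regression setup I have in mind. It is purely my own sketch: the `InverseNet` architecture, the MSE loss, and the dummy loader are all hypothetical, not your implementation:

```python
import torch
import torch.nn as nn

class InverseNet(nn.Module):
    """Hypothetical regressor: stacked (original, warped) boundary pair in,
    flattened 2x3 affine matrix out. Not the authors' architecture."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 6)

    def forward(self, b_orig, b_warped):
        x = torch.cat([b_orig, b_warped], dim=1)       # (N, 2, H, W)
        return self.head(self.features(x))             # (N, 6)

# Dummy stand-in for a real dataloader of (original, warped, theta) triplets.
loader = [(torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64), torch.rand(4, 2, 3))
          for _ in range(2)]

net = InverseNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
criterion = nn.MSELoss()                               # my assumed loss on theta

for b_orig, b_warped, theta in loader:
    pred = net(b_orig, b_warped)                       # predict affine params
    loss = criterion(pred, theta.flatten(1))           # regress against STN theta
    opt.zero_grad()
    loss.backward()
    opt.step()
```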

Looking forward to your reply. Thank you.

erfan9819 commented 1 year ago

I am confused about the training of the InverseForm network. Are the inputs to this network the boundary maps of the prediction and the GT? You wrote on GitHub that only ground-truth data is used to train the network, so I am confused about the role of the predicted boundary image.

The paper says that you first train the inverse-transformation network using boundary maps of images sampled from the target dataset: an STN generates transformed versions of the boundary images, and these images together with their associated transforms are the inputs to the inverse-transformation network. What is meant by "transformed versions of boundary images"? Thank you for your guidance; I look forward to your reply.
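
To make my current understanding concrete (please correct me if it is wrong; `inverse_net` below stands for the pretrained, frozen IF module, and everything else is my own assumption):

```python
import torch

# My reading of the two stages:
# Stage 1: train the IF net only on GT boundary maps and their STN-transformed
#          versions (the "transformed versions of boundary images"),
#          regressing the transform parameters.
# Stage 2: freeze the IF net and use it as a loss for the segmenter: it gets
#          the (predicted, GT) boundary pair, and the loss is the distance of
#          the predicted transform from the identity transform.

def inverseform_loss(pred_boundary, gt_boundary, inverse_net):
    """Assumed form: penalise how far the alignment transform is from identity."""
    theta = inverse_net(pred_boundary, gt_boundary)            # (N, 6)
    identity = torch.eye(2, 3, device=theta.device).flatten()  # 2x3 identity affine
    return ((theta - identity) ** 2).sum(dim=1).mean()
```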

Billy-ZTB commented 3 weeks ago

Hello @yeshwanth95! I also want to use this method on my aerial imagery dataset. May I ask whether you trained the IF module for your task, and if so, how did you train it?