fuzailpalnak / building-footprint-segmentation

Building footprint segmentation from satellite and aerial imagery
https://fuzailpalnak-buildingextraction-appbuilding-extraction-s-ov1rp9.streamlitapp.com/
Apache License 2.0

Best Model info #42

Closed teresalisanti closed 1 year ago

teresalisanti commented 2 years ago

Hi Fuzail, could you please share some information about the best model you got? I would like to know:

Thank you,

Teresa

fuzailpalnak commented 2 years ago

Which of the two models would you like information on?

  1. RefineNet trained on INRIA
  2. DlinkNet trained on Massachusetts Buildings Dataset
teresalisanti commented 2 years ago

Both if you can; otherwise RefineNet is enough.

fuzailpalnak commented 2 years ago

RefineNet

  1. Training

    • Training was carried out on 384x384 images for around 120 to 130 epochs (I can't remember the exact number).

    • For augmentation, I used a combination of color and geometric transforms. As far as I can recall, the color augmentations were applied with very low probability, i.e. they were used fairly rarely in the data-augmentation stage, whereas I used a lot of geometric augmentations: random rotate, vertical flip, horizontal flip, crop, resize. One additional augmentation I used explicitly was cropping the input image to a size in [224, 256, 288] and then rescaling it back to 384x384.

    • I applied min-max normalization to the images, followed by standard ImageNet normalization.

    • I used Adam as the optimizer with lr=1e-04 and kept the rest of the configuration at its defaults. I used an L2 regularizer to tackle overfitting, but relied on data augmentation for most of the overfitting control. For the loss I used a combination of Jaccard and binary cross-entropy with alpha=0.3, where alpha is the weight on the Jaccard term (a sketch of this combined loss is included at the end of this comment).

    • During training I used precision, recall, and Jaccard as metrics to monitor progress.

  2. Prediction

    • For prediction I aggregated predictions from multiple geometric augmentations.
  3. Metric

DlinkNet
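Roughly, the RefineNet loss described above was along these lines (a sketch reconstructed from the description, not the exact code; the class name and the soft-Jaccard formulation are just for illustration):

```python
import torch
import torch.nn as nn


class JaccardBCELoss(nn.Module):
    """Weighted combination of soft Jaccard loss and binary cross-entropy.

    `alpha` weights the Jaccard term and `1 - alpha` the BCE term, matching
    the alpha = 0.3 configuration described above.
    """

    def __init__(self, alpha: float = 0.3, eps: float = 1e-7):
        super().__init__()
        self.alpha = alpha
        self.eps = eps
        self.bce = nn.BCEWithLogitsLoss()

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        probs = torch.sigmoid(logits)
        intersection = (probs * targets).sum()
        union = probs.sum() + targets.sum() - intersection
        jaccard_loss = 1.0 - (intersection + self.eps) / (union + self.eps)
        return self.alpha * jaccard_loss + (1.0 - self.alpha) * self.bce(logits, targets)


# Usage with the optimizer settings mentioned above:
# criterion = JaccardBCELoss(alpha=0.3)
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```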

teresalisanti commented 1 year ago

Hi, is the model (RefineNet) you published on GitHub the best one you trained, with the hyperparameters above? I am a bit confused; it doesn't seem to perform that well on our aerial images.

fuzailpalnak commented 1 year ago

Yes, it's the best model. The difference could be because I also used test-time augmentation during inference.

fuzailpalnak commented 1 year ago

@teresalisanti are you running the RefineNet model on custom aerial imagery or on Inria data? And are the results from a fine-tuned RefineNet model, or just from the model weights shared in the repo?

teresalisanti commented 1 year ago

I trained the RefineNet model from scratch on custom aerial imagery data plus the Inria dataset, so I didn't use the model weights that you shared in the repository. I tested my best model configuration on my own images without test-time augmentation. Why do you use test-time augmentation during inference? I don't see the benefit. Could you share the code with the hyperparameters you listed above, so that we can train our model with that configuration?

fuzailpalnak commented 1 year ago

Unfortunately, I don't have the script I used for training; it was on my old laptop, which I no longer have, and on top of that I did not commit the file to git :(

TTA was helpful with IoU: in the case of buildings, larger buildings get cut across different image tiles, and scaling during prediction helps with that problem, although the effect is not that significant. Rotation helps increase the confidence of the prediction; it mainly suppresses pixels that don't have high confidence (false positives). There is also a boundary effect: buildings at the edge of an image may obtain low confidence, so mirroring, cropping, and scaling are applied to make sure buildings at the boundaries are still detected. A sketch of this kind of aggregation is shown below.
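Something like this, a sketch of test-time-augmentation aggregation (only flips here for brevity; the exact set of rotations, crops, and scales may differ, and `model` is assumed to output logits for a single-channel mask):

```python
import torch


@torch.no_grad()
def predict_with_tta(model, image: torch.Tensor) -> torch.Tensor:
    """Average sigmoid predictions over simple geometric test-time augmentations.

    `image` is an (N, C, H, W) batch; each flip is inverted before averaging so
    the aggregated mask stays aligned with the input.
    """
    model.eval()
    # (transform, inverse transform) pairs: identity, horizontal flip, vertical flip
    transforms = [
        (lambda x: x, lambda x: x),
        (lambda x: torch.flip(x, dims=[-1]), lambda x: torch.flip(x, dims=[-1])),
        (lambda x: torch.flip(x, dims=[-2]), lambda x: torch.flip(x, dims=[-2])),
    ]
    predictions = []
    for forward, inverse in transforms:
        logits = model(forward(image))
        predictions.append(inverse(torch.sigmoid(logits)))
    return torch.stack(predictions).mean(dim=0)
```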

What I can suggest is to try fine-tuning the model; if the image quality differs a lot, then just use the weights for the earlier layers. If you plan to train the model from scratch, then use the ImageNet weights for ResNet.

teresalisanti commented 1 year ago

Thank you!! Did you train your model from scratch, or just the top layers of the network by applying transfer learning with ImageNet weights? How can I modify the scripts (refinenet.py, segmentation.py) to freeze all but the top layers, which I would like to retrain using the weights you shared in this repo? Thank you :)

fuzailpalnak commented 1 year ago

I used ImageNet weights to initialise the ResNet module and the default PyTorch initialisation for the rest; that's how I set up the training.

In the library, the default is to use ImageNet weights for training, i.e. it does what I described above.

However, the library doesn't have functionality to take pre-trained weights that are not from torchvision, split them by layer, and make selected layers non-trainable. You will have to write a custom function that sets the chosen layers to non-trainable after you have loaded the model.

What I would suggest instead is to load the entire weights file (the file shared in the repo) and fine-tune it on your set of images. You can start with lr=1e-04, see how the training progresses, and make changes accordingly. A rough sketch is below.
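Something along these lines, assuming PyTorch; the import path, weights filename, and the "resnet" parameter-name prefix below are guesses for illustration, so check them against the repo before using this:

```python
import torch
from torch.optim import Adam

# Import path is from memory and may differ; check the repo README.
from building_footprint_segmentation.seg.binary.models import ReFineNet

model = ReFineNet()

# Load the weights file shared in the repo (filename assumed; the checkpoint
# may also wrap the state_dict under a key, adjust accordingly).
state = torch.load("refine.pth", map_location="cpu")
model.load_state_dict(state)

# Optionally freeze the pre-trained encoder so only the decoder is fine-tuned.
# The "resnet" prefix is a guess; print model.named_parameters() to find the
# actual encoder layer names.
for name, param in model.named_parameters():
    if name.startswith("resnet"):
        param.requires_grad = False

# Fine-tune the remaining parameters, starting from lr=1e-04.
optimizer = Adam((p for p in model.parameters() if p.requires_grad), lr=1e-4)
```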

You can set this param to False; that way only the weights in the decoder part will be updated.

teresalisanti commented 1 year ago

Thank you! I am training the model using your weights for the ResNet. Why do I always get higher metrics on the validation set?

[screenshot: per-epoch training and validation metrics]

This is really strange, and it happens at every epoch.

fuzailpalnak commented 1 year ago

One common reason for this is augmentation: when augmentation is applied, the model gets some hard examples to learn from, which causes the validation metric to be lower than the training metric; eventually the model should be able to adapt to those hard examples and produce a more typical metric.

If the issue still persists, then you should either make the model more complex or reduce the augmentation.

teresalisanti commented 1 year ago

If the metrics on the validation set are lower than those on the training set, then everything is working fine; that should be the normal outcome of training, as you said above. In our case the metrics on the validation set are always higher than those on the training set.

fuzailpalnak commented 1 year ago

If the validation metric is lower than the training metric for the initial epochs, that's not a problem; however, if the training metric is always higher than the validation metric throughout training, then it could be considered a problem, as this behaviour is not desired.

To avoid such behaviour, the model's performance on the training data without augmentation could be checked, to verify that the model is not under-fitting. If that's not the case, then the model complexity could be increased so the model is able to adapt to the varying examples in the training set. One other sanity check would be to train a model with higher capacity (perhaps ResNet-152 or larger as the backbone) and observe the behaviour. A sketch of the no-augmentation check is below.
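Roughly like this (a sketch; `train_dataset_no_aug` and `val_dataset` are placeholders for datasets that apply only the deterministic resize and normalisation, with no augmentation):

```python
import torch
from torch.utils.data import DataLoader


def iou_score(preds: torch.Tensor, targets: torch.Tensor, eps: float = 1e-7) -> float:
    """Intersection-over-union for binary masks."""
    intersection = (preds * targets).sum()
    union = preds.sum() + targets.sum() - intersection
    return float((intersection + eps) / (union + eps))


@torch.no_grad()
def evaluate(model, dataset, metric_fn, batch_size: int = 8) -> float:
    """Mean metric over a dataset, with no augmentation applied."""
    model.eval()
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    scores = []
    for images, masks in loader:
        preds = (torch.sigmoid(model(images)) > 0.5).float()
        scores.append(metric_fn(preds, masks))
    return sum(scores) / len(scores)


# Compare the two; if the un-augmented training score is also low, the model is
# likely under-fitting rather than the validation set being "too easy".
# train_iou = evaluate(model, train_dataset_no_aug, iou_score)
# val_iou = evaluate(model, val_dataset, iou_score)
```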