Closed — teresalisanti closed this issue 1 year ago
Which of the two models would you like information on?
Both if you can; otherwise RefineNet is enough.
Training
Training was carried out on 384x384 images for around 120 to 130 epochs (I can't remember the exact number).
For augmentation, I used a combination of color and geometric transforms. As far as I can recall, the color augmentations were applied with very low probability, i.e. they were used only rarely, whereas I used a lot of geometric augmentations: random rotate, vertical flip, horizontal flip, crop, and resize. One additional augmentation I used explicitly was cropping the input image to a size in [224, 256, 288] and then rescaling it back to 384x384.
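The geometric part of that pipeline can be sketched in plain numpy (the actual augmentation library used isn't stated in the thread, so this is illustrative rather than the original code; color augmentations are omitted):

```python
import numpy as np

def random_geometric_augment(img, mask, rng):
    """Jointly augment an HxWx3 image and its HxW mask: random flips,
    a random 90-degree rotation, and the crop-to-[224, 256, 288]-then-
    rescale-back-to-384 trick described above."""
    if rng.random() < 0.5:                        # horizontal flip
        img, mask = img[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                        # vertical flip
        img, mask = img[::-1], mask[::-1]
    k = int(rng.integers(0, 4))                   # random 90-degree rotation
    img, mask = np.rot90(img, k), np.rot90(mask, k)
    if rng.random() < 0.5:                        # crop, then rescale to 384
        size = int(rng.choice([224, 256, 288]))
        y = int(rng.integers(0, img.shape[0] - size + 1))
        x = int(rng.integers(0, img.shape[1] - size + 1))
        img = img[y:y + size, x:x + size]
        mask = mask[y:y + size, x:x + size]
        idx = np.arange(384) * size // 384        # nearest-neighbour resize
        img, mask = img[idx][:, idx], mask[idx][:, idx]
    return np.ascontiguousarray(img), np.ascontiguousarray(mask)
```

The same random transform is applied to image and mask together, which is what keeps the segmentation labels aligned.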
I applied min-max normalization to the images, followed by standard ImageNet normalization.
I used Adam as the optimizer with lr=1e-04 and kept the rest of the configuration at its defaults. I used an L2 regularizer to tackle overfitting, but mostly relied on data augmentation to handle it. For the loss I used a combination of Jaccard and binary cross-entropy with alpha=0.3, where alpha is the weight on the Jaccard term.
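The weighted loss described above can be written down directly; a numpy sketch (the framework-specific version would look the same term for term):

```python
import numpy as np

def soft_jaccard_loss(pred, target, eps=1e-7):
    """1 - soft IoU between a probability map and a binary target."""
    inter = (pred * target).sum()
    union = pred.sum() + target.sum() - inter
    return 1.0 - (inter + eps) / (union + eps)

def bce_loss(pred, target, eps=1e-7):
    """Binary cross-entropy, with clipping for numerical stability."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def combined_loss(pred, target, alpha=0.3):
    """alpha weights the Jaccard term, as described above."""
    return alpha * soft_jaccard_loss(pred, target) + (1 - alpha) * bce_loss(pred, target)
```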
During training I used precision, recall, and Jaccard as metrics to monitor progress.
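For a binary building mask, all three monitoring metrics come from the same confusion counts; a small sketch:

```python
import numpy as np

def seg_metrics(pred, target, thresh=0.5):
    """Precision, recall and Jaccard (IoU) for a binary segmentation mask.
    `pred` is a probability map, thresholded at `thresh`."""
    p = np.asarray(pred) >= thresh
    t = np.asarray(target).astype(bool)
    tp = np.logical_and(p, t).sum()
    fp = np.logical_and(p, ~t).sum()
    fn = np.logical_and(~p, t).sum()
    eps = 1e-7  # avoids division by zero on empty masks
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    jaccard = tp / (tp + fp + fn + eps)
    return precision, recall, jaccard
```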
Prediction
Metric
Hi, is the model (RefineNet) you published on GitHub the best one you trained with the hyperparameters above? I am a bit confused; it doesn't seem to perform that well on our aerial images.
Yes, it's the best model. The gap could be because I also used test-time augmentation during inference.
@teresalisanti are you running the RefineNet model on custom aerial imagery data, or on the Inria data? And are the results from a fine-tuned RefineNet model, or just from the model weights shared in the repo?
I trained the RefineNet model from scratch on custom aerial imagery data plus the Inria dataset, so I didn't use the model weights that you shared in the repository. I tested my best model configuration on my own images without test-time augmentation. Why do you use test-time augmentation during inference? I don't see the benefit. Could you share the code with the hyperparameters you listed above, so that we can train our model with that configuration?
Unfortunately, I don't have the script that I used for training; it was on my old laptop, which I no longer have, and on top of that I did not commit the file to git :(
TTA was helpful for IoU: larger buildings get cut across different tiles, and scaling during prediction helps with that, although the gain is not that significant. Rotation helps increase the confidence of the prediction; it mainly suppresses pixels that don't have high confidence (false positives). There is also a boundary effect: buildings at the edge of the image may get low confidence, so mirroring, cropping, and scaling are applied to ensure buildings at the boundaries are still detected.
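The flip/rotation part of that TTA scheme boils down to "transform, predict, invert, average"; a minimal sketch (the `model` callable here is a placeholder for the actual network, and the exact set of transforms used in the thread is assumed, not documented):

```python
import numpy as np

def tta_predict(model, img):
    """Average predictions over flip/rotation test-time augmentations.
    `model` is any callable mapping an HxWxC image to an HxW probability
    map; each augmented prediction is mapped back to the original frame
    before averaging."""
    preds = [model(img)]
    preds.append(np.flip(model(np.flip(img, axis=1)), axis=1))  # horizontal flip
    preds.append(np.flip(model(np.flip(img, axis=0)), axis=0))  # vertical flip
    for k in (1, 2, 3):                                         # 90-degree rotations
        preds.append(np.rot90(model(np.rot90(img, k)), -k))
    return np.mean(preds, axis=0)
```

Because the predictions are averaged, pixels that only score high under one orientation get pulled down, which is the false-positive suppression effect described above.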
What I can suggest is to try fine-tuning the model; if the image quality differs a lot, then just reuse the weights for the earlier layers. If you plan to train the model from scratch, then use the ImageNet weights for ResNet.
Thank you!! Did you train your model from scratch, or just the top layers of the network by applying transfer learning with ImageNet weights? How can I modify the scripts (refinenet.py, segmentation.py) to freeze all but the top layers, which I would like to retrain using the weights you shared in this repo? Thank you :)
I used ImageNet weights to initialise the ResNet module and the default PyTorch initialisation for the rest; that's how I set up the training.
The library defaults to using ImageNet weights for training, i.e. it does what I described above.
However, the library doesn't have functionality to split out pre-trained weights that are not present in torchvision and make them non-trainable. You will have to write a custom function that sets the chosen layers to non-trainable after you have loaded the model.
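Such a custom function usually just walks the model's named parameters and freezes those matching the pre-trained part. In PyTorch you would iterate `model.named_parameters()` and set `requires_grad = False`; the sketch below uses a tiny stand-in `Param` class so it stays dependency-free, and the `"resnet"` prefix in the usage note is an assumption about how the encoder's parameters are named:

```python
class Param:
    """Minimal stand-in for a framework parameter object
    (a real torch.nn.Parameter also carries a requires_grad flag)."""
    def __init__(self):
        self.requires_grad = True

def freeze_by_prefix(named_params, prefix):
    """Set requires_grad=False on every parameter whose name starts
    with `prefix` (e.g. the ResNet encoder). Returns the count frozen."""
    frozen = 0
    for name, p in named_params:
        if name.startswith(prefix):
            p.requires_grad = False
            frozen += 1
    return frozen
```

With a real model this would be called as `freeze_by_prefix(model.named_parameters(), "resnet")`; check your model's actual parameter names first, since the prefix varies between implementations.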
What I would suggest instead is to load the entire weights file (the file shared in the repo) and fine-tune it on your set of images. You can start with lr=1e-04, see how the training progresses, and make changes accordingly.
You can set this param == False; that way only the weights in the decoder part will be updated.
Thank you! I am training the model using your weights for ResNet. Why do I always get higher metrics on the validation set?
This is really strange, and it happens at every epoch.
One common reason for this is that when augmentation is applied, the model gets hard examples to learn from, which causes the training metric to be lower than the validation metric; eventually the model should adapt to those hard examples and the two metrics should come closer together.
If the issue still persists, then you should either make the model more complex or reduce the augmentation.
If the metrics on the validation set were lower than those on the training set, everything would be working fine; that is the usual outcome of training. In our case, the metrics on the validation set are always higher than those on the training set.
If the validation metric is lower than the training metric for the initial epochs, that's not a problem; however, if the training metric stays higher than the validation metric throughout training, that could be considered a problem, as this behaviour is not desired.
To diagnose such behaviour, check the model's performance on the training data without augmentation, to verify that the model is not under-fitting. If that's not the case, model complexity could be increased so the model is able to adapt to the varying examples in the training set. One other sanity check would be to train a higher-capacity model (perhaps ResNet-152 or larger) and observe the behaviour.
Hi Fuzail, could you please share some information about the best model you got? I would like to know:
Thank you,
Teresa