keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

VGG 16 Overfitting Issues #1662

Closed NiharJani2002 closed 1 year ago

NiharJani2002 commented 1 year ago

Short Description

One major issue is overfitting: the model can fit the existing training data well and still perform poorly on new, unseen data.

Papers

https://arxiv.org/abs/1409.1556

Existing Implementations

https://github.com/keras-team/keras-cv/blob/master/keras_cv/models/vgg16.py

Other Information

Overfitting can lead to incorrect predictions and unreliable results. It can also make the model overly complex, which makes its decision-making process difficult to interpret and explain. This is a significant obstacle where transparency and interpretability are required, such as in legal or regulatory contexts. It is therefore essential to address overfitting in VGG16 to ensure reliable and accurate results in real-world applications.

@LukeWood @ianstenbit, can you assign me this issue so that I can work on it?

jbischof commented 1 year ago

@NiharJani2002 can you clarify what you're proposing? Overfitting is a potential issue for any model. Some solutions are architecture-specific (e.g., dropout), others are training-time-specific (e.g., early stopping), but VGG16 is a fixed architecture.
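To illustrate the training-time option mentioned above, here is a minimal sketch of patience-based early stopping in plain Python. The function name `early_stopping_epoch`, its parameters, and the loss values are hypothetical and chosen only for illustration; this is not KerasCV code. In practice, Keras ships a `keras.callbacks.EarlyStopping` callback that follows the same idea.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the validation loss has failed to improve by more
    than `min_delta` for `patience` consecutive epochs. If that never
    happens, the last epoch is returned as the stop epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and keep training.
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop here.
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Illustrative (made-up) validation losses: improving, then getting worse.
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=2)
print(f"stop at epoch {stop_epoch}, keep weights from epoch {best_epoch}")
```

Because this only changes when training stops, the VGG16 architecture itself is left untouched.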

NiharJani2002 commented 1 year ago

My suggestion to improve the performance of the VGG16 architecture is to add layers tailored to the specific task at hand. First, I recommend adding a GlobalAveragePooling2D layer to reduce the dimensions of the feature maps obtained from the convolutional layers. This layer produces a fixed-length vector for each image by averaging the feature maps across the spatial dimensions. Second, I suggest adding a Dropout layer that randomly drops some neurons during training to prevent overfitting. Finally, a Dense layer should map the output of the preceding layers to the number of classes in the classification task and apply a softmax activation to generate the final class probabilities. @jbischof
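As a rough illustration of what the three proposed layers compute, the sketch below implements their forward pass in plain Python on a toy feature map. All function names, weights, and values are hypothetical and exist only for this example; in Keras the corresponding layers would be `GlobalAveragePooling2D`, `Dropout`, and `Dense` with a softmax activation.

```python
import math
import random


def global_average_pool(feature_map):
    """Average an H x W x C feature map over H and W, giving a length-C vector."""
    height, width = len(feature_map), len(feature_map[0])
    sums = [0.0] * len(feature_map[0][0])
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]


def dropout(vector, rate, training, rng):
    """Inverted dropout: during training, zero each unit with probability
    `rate` and rescale the survivors by 1 / (1 - rate); identity otherwise."""
    if not training or rate == 0.0:
        return list(vector)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vector]


def dense_softmax(vector, weights, biases):
    """Dense layer (weights[c][k] maps input c to class k) followed by softmax."""
    logits = [
        sum(vector[c] * weights[c][k] for c in range(len(vector))) + biases[k]
        for k in range(len(biases))
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 2 x 2 feature map with 3 channels (made-up values).
feature_map = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[1.0, 4.0, 2.0], [3.0, 0.0, 2.0]],
]
weights = [[0.5, -0.5], [1.0, 0.0], [0.0, 0.25]]  # 3 channels -> 2 classes
biases = [0.0, 0.0]

pooled = global_average_pool(feature_map)
dropped = dropout(pooled, rate=0.5, training=False, rng=random.Random(0))
probs = dense_softmax(dropped, weights, biases)
print(pooled, probs)
```

Note that dropout is only active when `training=True`; at inference time the pooled vector passes through unchanged.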

jbischof commented 1 year ago

@NiharJani2002 thanks for the suggestion, but with architectural changes it would no longer be a VGG16 model and would no longer be compatible with our pretrained checkpoints. Our mission is to replicate popular architectures rather than innovate in the space. If you would like to contribute to KerasCV, I recommend either picking up one of our many "contributors welcome" issues or filing a feature request aligned with our roadmap.

NiharJani2002 commented 1 year ago

Thanks, @jbischof, for clarifying which types of issues KerasCV wants to solve. If you find any issues that need contributors, please feel free to assign them to me.