Possibly include ASAM (Adaptive Sharpness-Aware Minimization) to help with generalization and learning, i.e. instead of SGD -> SOC(Adam), use ASAM -> SGD -> SOC(Adam). It will slow down training, since each step needs two gradient passes instead of one, but it may help with avoiding sharp local minima and overfitting. Of course, experimenting with different backbones and input preprocessing is an option as well.
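To make the suggestion concrete, here is a rough sketch of what wrapping a plain SGD step in ASAM looks like, as I understand the method: perturb the weights toward higher loss with a weight-scaled step, take the gradient there, then apply the normal SGD update to the original weights. Everything here is an assumption for illustration: the function names `asam_sgd_step` and `toy_grad`, the toy quadratic loss, and the values of `lr`, `rho`, and `eta`. The SOC(Adam) stage is not modeled at all.

```python
import math

def asam_sgd_step(w, grad_fn, lr=0.05, rho=0.05, eta=0.01):
    """One ASAM-wrapped SGD step on a list of floats (illustrative sketch only)."""
    g = grad_fn(w)                                  # first gradient pass, at w
    t = [abs(wi) + eta for wi in w]                 # element-wise scale T_w = |w| + eta
    tg = [ti * gi for ti, gi in zip(t, g)]          # T_w * g
    norm = math.sqrt(sum(x * x for x in tg)) or 1e-12   # guard against a zero gradient
    # Adaptive perturbation: eps = rho * T_w^2 * g / ||T_w * g||
    eps = [rho * ti * x / norm for ti, x in zip(t, tg)]
    w_adv = [wi + ei for wi, ei in zip(w, eps)]     # perturbed weights
    g_adv = grad_fn(w_adv)                          # second gradient pass, at w + eps
    # Base optimizer (plain SGD) updates the ORIGINAL weights with the perturbed gradient
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

def toy_loss(w):
    # Hypothetical stand-in for the real training loss
    return 0.5 * (1.0 * w[0] ** 2 + 10.0 * w[1] ** 2)

def toy_grad(w):
    # Gradient of toy_loss
    return [1.0 * w[0], 10.0 * w[1]]

w = [1.0, 1.0]
w = asam_sgd_step(w, toy_grad)
```

The two `grad_fn` calls per step are where the slowdown comes from. With `rho=0` the perturbation vanishes and this reduces to an ordinary SGD step, so it is easy to A/B against the current setup.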
Great work! Very well done :)