Selection of hyperparameters

Hi, I'd like to know whether all the hyperparameters in this paragraph from your paper were chosen with a grid search.

During the U-Net training process, the Adam algorithm was employed to minimize the Cross-Entropy loss with an initial learning rate of 2×10⁻⁴. Moreover, we utilized early stopping based on the loss of the validation set and trained for 44 epochs. After the 40th epoch, the learning rate was reduced to 2×10⁻⁵. The selected batch size was 5 samples. We also employed random rotations of the input images by -90°, 0°, 90°, or 180° and horizontal flips in order to augment the dataset. The selection of the hyperparameters above and training set-up was based on grid search in the validation set.
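To check that I am reading the schedule and augmentation correctly, here is a minimal sketch of how I understand them. The function names, the 1-indexed epoch convention, and the 50% flip probability are my own assumptions; only the numeric values come from the paragraph above.

```python
import random

ROTATIONS_DEG = (-90, 0, 90, 180)  # angles listed in the paper


def learning_rate(epoch, initial_lr=2e-4, reduced_lr=2e-5, drop_after=40):
    """Learning rate for a 1-indexed epoch (indexing is my assumption)."""
    return initial_lr if epoch <= drop_after else reduced_lr


def sample_augmentation(rng):
    """Pick one rotation and decide whether to flip horizontally.

    The 0.5 flip probability is an assumption; the paper does not state it.
    """
    return {
        "rotation_deg": rng.choice(ROTATIONS_DEG),
        "horizontal_flip": rng.random() < 0.5,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    for epoch in (1, 40, 41, 44):
        print(epoch, learning_rate(epoch), sample_augmentation(rng))
```

If the drop happens at the start of epoch 40 rather than after it, the comparison in `learning_rate` would need to be `<` instead of `<=`.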

Could you share which other batch sizes and learning rates you tried? I want to experiment with larger values for both, and I would like to know whether you have already done so and what the results were. Also, did you try SGD with momentum, a different learning rate scheduler, or additional augmentations?
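To make the question concrete, this is roughly the grid I have in mind. Only batch size 5, learning rate 2e-4, and Adam come from the paper; every other candidate value and the `build_grid` helper are placeholders of mine.

```python
from itertools import product

# Candidate values. The first entry of each list is the paper's setting;
# the rest are hypothetical values I would like to try.
batch_sizes = [5, 8, 16]
learning_rates = [2e-4, 5e-4, 1e-3]
optimizers = ["adam", "sgd_momentum"]


def build_grid(batch_sizes, learning_rates, optimizers):
    """Return every combination of the candidates as a list of config dicts."""
    return [
        {"batch_size": b, "lr": lr, "optimizer": opt}
        for b, lr, opt in product(batch_sizes, learning_rates, optimizers)
    ]


grid = build_grid(batch_sizes, learning_rates, optimizers)

if __name__ == "__main__":
    for config in grid:
        print(config)
```

Knowing which of these combinations you already covered would save me from repeating runs.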

Basically, I would like a bit more detail about your hyperparameter search. Thank you.

marine-debris / marine-debris.github.io
