Btw, for my current configuration I'm using this LR reducer with the Adam optimizer, LR set to 0.001, and regular tf training:
```python
import tensorflow as tf

lr_reducer = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_classification_loss", patience=3, min_lr=1e-6, mode='min')
```
SGD + momentum works better than Adam for finetuning. You can also check SGDW / AdamW from tensorflow-addons. It also seems better to use tfa.optimizers.Lookahead when finetuning. For the LR schedule, tf.keras.experimental.CosineDecayRestarts with SGD, also starting from initial_learning_rate=1e-3, is worth using for finetuning (see the combined sketch after the example below). StochasticDepth, which also comes from tensorflow-addons, may also help:
```python
# survivals=(1, 0.8): stochastic depth, block survival probability tapering from 1.0 to 0.8 with depth
model = efficientnet_v2.EfficientNetV2L(input_shape=(None, None, 3), survivals=(1, 0.8), dropout=1e-6, classes=1000)
```
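Not something prescribed by the repo, but a minimal sketch of how those optimizer suggestions could be wired together; the first_decay_steps, weight_decay, and Lookahead settings are placeholder values to tune for your own dataset:

```python
import tensorflow as tf
import tensorflow_addons as tfa

# Cosine schedule with warm restarts, starting from 1e-3 as suggested above.
# first_decay_steps is a placeholder; pick it from your steps per epoch.
lr_schedule = tf.keras.experimental.CosineDecayRestarts(
    initial_learning_rate=1e-3, first_decay_steps=1000, t_mul=2.0, m_mul=0.9)

# SGD + momentum with decoupled weight decay (SGDW).
# The weight_decay value here is only illustrative.
sgdw = tfa.optimizers.SGDW(
    weight_decay=1e-5, learning_rate=lr_schedule, momentum=0.9, nesterov=True)

# Wrap the base optimizer in Lookahead.
optimizer = tfa.optimizers.Lookahead(sgdw, sync_period=6, slow_step_size=0.5)

# model.compile(optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"])
```

Lookahead simply wraps whatever base optimizer you choose, so swapping SGDW for tfa.optimizers.AdamW keeps the rest unchanged.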
Dropout or other regularization I haven't tested...

Thank you very much for your insights!
Do you have an idea of the best configuration for finetuning? Which optimizer, which LR reduction strategy, whether dropout or other regularization is needed, and whether any unusual training techniques are required?