F1 Score Curve During Meta Train/Validation

Hi, I am training VGG11 on a custom image dataset for 3-way 5-shot image classification using MAML. I am wrapping the whole VGG11 model with MAML, not just the classification head. My hyperparameters are as follows:

Meta LR: 0.001
Fast LR: 0.5
Adaptation steps: 1
First order: False
Meta Batch Size: 5
Optimizer: AdamW

During training, I noticed that after the first outer-loop optimization step, i.e., AdamW.step(), the loss skyrockets to very large values, on the order of tens of thousands. Is this normal? I am also using the micro F1 score as my accuracy metric; its curve during meta-training/validation is as follows:

The curve fluctuates too much in my opinion. Is this normal?
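For context on the metric, here is a minimal sketch of micro-averaged F1 for single-label predictions such as a 3-way task. The function names `micro_f1` and `accuracy` and the sample labels are made up for illustration and are not taken from the actual training code. In the single-label case every misclassification counts as one false positive and one false negative, so micro F1 should come out equal to plain accuracy:

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label multiclass predictions (illustrative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # predicted class p wrongly
            fn[t] += 1      # missed the true class t

    # Micro averaging pools the counts over all classes before computing F1.
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    denom = 2 * total_tp + total_fp + total_fn
    return 2 * total_tp / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of exactly matching labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical labels for a 3-way task, not real results.
    y_true = [0, 1, 2, 0, 1, 2]
    y_pred = [0, 2, 2, 0, 1, 1]
    print(micro_f1(y_true, y_pred), accuracy(y_true, y_pred))
```

If that equivalence holds for the setup here, the F1 curve can be read as an accuracy curve, which with only a handful of query samples per task may move in coarse steps.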

Thanks.
