openai / supervised-reptile

Code for the paper "On First-Order Meta-Learning Algorithms"
https://arxiv.org/abs/1803.02999
MIT License

When using the pre-trained model for retraining, the accuracy declines. What is the reason and is it normal? #25

Closed · yikeqingli closed 5 years ago

yikeqingli commented 5 years ago

```python
import random

import tensorflow as tf

from supervised_reptile.args import (argument_parser, evaluate_kwargs,
                                     model_kwargs, train_kwargs)
from supervised_reptile.eval import evaluate
from supervised_reptile.miniimagenet import read_dataset
from supervised_reptile.models import MiniImageNetModel
from supervised_reptile.train import train

DATA_DIR = 'data/miniimagenet'

args = argument_parser().parse_args()
random.seed(args.seed)

train_set, val_set, test_set = read_dataset(DATA_DIR)
model = MiniImageNetModel(args.classes, **model_kwargs(args))
eval_kwargs = evaluate_kwargs(args)

with tf.Session() as sess:
    if not args.pretrained:
        print('Training...')
        sess.run(tf.global_variables_initializer())
        # Restore the weights of the previously trained Reptile model.
        saver_1 = tf.train.Saver(tf.trainable_variables())
        print(args.checkpoint)
        saver_1.restore(sess, tf.train.latest_checkpoint(args.checkpoint))
        # Sanity check: evaluate the restored model before retraining.
        print('Test accuracy: ' + str(evaluate(sess, model, test_set, **eval_kwargs)))
        tf.summary.FileWriter('./logs', tf.get_default_graph())
        # Resume Reptile training from the restored weights. Note: the extra
        # model.minimize_op argument assumes a locally modified train(); the
        # stock signature is train(sess, model, train_set, test_set, save_dir, ...).
        train(sess, model.minimize_op, model, train_set, test_set,
              args.checkpoint, **train_kwargs(args))
```

When I retrain starting from the model produced by Reptile training, the model saved after the first round of training tests about 20 points lower, at roughly 0.24. (Note: I run an evaluation before retraining, which verifies that the checkpoint is loaded correctly. The only change is the restore step added before the original Reptile `train` call; the training code and the data set are unchanged.) The accuracy climbs back to 0.36 by round 200 and 0.42 by round 3000, while the original model is at about 0.46. The training method, the data set, and the model loading are exactly the same as those used to train the original model. Is this normal, and what causes it?
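One possible cause worth ruling out (my assumption, not something established in this thread): `tf.train.Saver(tf.trainable_variables())` restores only the model weights, so any optimizer state in the graph, such as the Adam moment accumulators created for `minimize_op`, keeps its fresh values from `global_variables_initializer()`. The first rounds of retraining would then run with a reset optimizer even though the weights themselves are correct. A minimal sketch of restoring everything instead, assuming the checkpoint was written by a saver over all variables (as I believe the stock `train()`'s default `tf.train.Saver()` is):

```python
import tensorflow as tf

# Sketch under the assumption above. Build the graph exactly as in the snippet
# earlier in this issue (so the optimizer slot variables exist), then restore
# every global variable rather than only tf.trainable_variables().
full_saver = tf.train.Saver(tf.global_variables())  # includes Adam moments

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # 'path/to/checkpoint_dir' is a placeholder for args.checkpoint. This
    # raises a NotFoundError if the slot variables are absent from the file.
    full_saver.restore(sess, tf.train.latest_checkpoint('path/to/checkpoint_dir'))
```

If the drop persists even with every variable restored, the optimizer state is probably not the culprit.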

yikeqingli commented 5 years ago

A follow-up: in the first round I evaluated both the in-memory model at the end of training (before saving) and the checkpoint saved at the end of that round. After the first round of training, the in-memory evaluation had already dropped by about 10%, and loading the model saved at that point dropped the accuracy by roughly another 10%.
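To localize the two separate ~10% drops, one simple diagnostic (the helper name is illustrative, not from this repo) is to compare each trainable variable in memory against the tensor stored in the checkpoint that was just saved. If they match, the save/restore round trip is clean and the second drop has to come from the evaluation procedure itself:

```python
import numpy as np
import tensorflow as tf

def compare_with_checkpoint(sess, checkpoint_dir):
    """Print trainable variables whose in-memory values differ from the
    values stored in the latest checkpoint under checkpoint_dir."""
    reader = tf.train.NewCheckpointReader(tf.train.latest_checkpoint(checkpoint_dir))
    for var in tf.trainable_variables():
        in_memory = sess.run(var)
        on_disk = reader.get_tensor(var.op.name)
        if not np.allclose(in_memory, on_disk):
            print('mismatch in %s: max abs diff %g'
                  % (var.op.name, np.max(np.abs(in_memory - on_disk))))
```

If nothing is reported, the weights survive the round trip unchanged, which would point at the evaluation setup rather than at saving or loading.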