Now I have finally got results closer to the reported performance. It seems that learning rate decay and data augmentation (random crop) are important in pre-training.
@icoz69 Hello, I also could not reproduce the model pre-training, and I wonder if you could share the pre-training code? I would be very grateful, thanks a lot!
@flexibility2 Hello, the training is basically standard classification over 64 classes of images, which is very common. For data augmentation, I simply apply random crop. The learning rate starts at 0.1 and is multiplied by 0.1 every 10 epochs. Since I have changed a lot of the network API, my pre-training code cannot be run on its own.
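For anyone trying to reproduce this, here is a minimal sketch of the recipe as I understand it from the comment above (64-way classification, random crop, SGD with lr 0.1 decayed by 0.1 every 10 epochs). The backbone, dataset path, image size, batch size, and number of epochs are my own assumptions, not values from the repo.

```python
# Sketch of the pre-training recipe described above.
# Backbone, paths, batch size, and epoch count are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

train_tf = transforms.Compose([
    transforms.RandomCrop(84, padding=8),  # random crop; 84x84 miniImageNet images assumed
    transforms.ToTensor(),
])
# hypothetical path to the 64 training classes
train_set = datasets.ImageFolder('miniimagenet/train', transform=train_tf)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)

model = models.resnet18(num_classes=64)  # stand-in for the repo's actual encoder
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
# lr starts at 0.1 and is multiplied by 0.1 every 10 epochs, as stated above
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
criterion = nn.CrossEntropyLoss()

for epoch in range(30):  # assumed number of epochs
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```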
@icoz69 OK, thank you all the same! And are you Chinese? You know, I also work on few-shot learning; could I chat with you via WeChat or QQ?
@flexibility2 hi, you can drop me an email if you have questions. chi007@e.ntu.edu.sg
@icoz69 Can you share the batch size and other hyperparameters for the pre-training code?
Hello, I have tried to reproduce the model pre-training myself instead of using your provided parameters, but I could not get ideal results. Let me describe what I have done, to check whether I am doing something wrong.
There are 64 training classes with 600 images each. So I add a 640x64 fc layer after the feature encoder and use this model to train a standard classification network on the 64x600 images. For validation, I drop the fc layer and use the encoder as a prototypical network to monitor performance on the 16 validation classes. The validation accuracy reaches at most 46% during the pre-training stage, while the training loss and accuracy easily overfit to 0 and 1. When I use this encoder to initialize FEAT, the validation accuracy reaches at most 50.3%. So I am wondering whether I understand your pre-training stage correctly. Thank you very much.
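To make the setup concrete, here is a minimal sketch of what I described: an fc head on top of the encoder for 64-way pre-training, and validation that drops the head and classifies queries by distance to class prototypes. The encoder, the 640-dim feature size, and the helper names are assumptions for illustration, not code from the repo.

```python
# Sketch of the described pre-training head and prototypical validation.
# Feature dimension (640) and encoder are assumptions.
import torch
import torch.nn as nn

class PretrainClassifier(nn.Module):
    def __init__(self, encoder, feat_dim=640, num_classes=64):
        super().__init__()
        self.encoder = encoder                       # shared feature encoder
        self.fc = nn.Linear(feat_dim, num_classes)   # 640x64 head, used only for pre-training

    def forward(self, x):
        return self.fc(self.encoder(x))

@torch.no_grad()
def proto_eval(encoder, support, support_labels, query, n_way):
    """Validate as a prototypical network: drop the fc head and classify
    query embeddings by nearest class-mean (prototype) in feature space."""
    s_feat = encoder(support)                        # (n_way * k_shot, feat_dim)
    q_feat = encoder(query)                          # (n_query, feat_dim)
    protos = torch.stack([s_feat[support_labels == c].mean(0) for c in range(n_way)])
    logits = -torch.cdist(q_feat, protos)            # negative Euclidean distance
    return logits.argmax(dim=1)
```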