conormdurkan / neural-statistician

PyTorch Implementation of Neural Statistician

Advice on reproducing Omniglot results? #2

Open comRamona opened 5 years ago

comRamona commented 5 years ago

Hi,

Your implementation is very nice and it seems to stay close to the paper. However, I cannot reproduce the nice-looking few-shot samples from OMNIGLOT to MNIST.

This is what the paper shows: https://github.com/comRamona/Neural-Statistician/blob/master/output-omni1/figures/Screenshot%20from%202019-02-14%2023-33-54.png

These are my results from just running the code with the default parameters (which match the ones in the paper, 300 epochs, etc.): https://github.com/comRamona/Neural-Statistician/blob/master/output-omni1/figures/13-02-2019-21:38:33-mnist-grid-300.png

I noticed in one of your older commits that you got a similar-looking result to what I get: https://github.com/conormdurkan/neural-statistician/blob/dcc6796f94f6551df0126f0f45865012e96cea06/omniglot/output/figures/16-07-2017-16:30:52-mnist-grid-300.png?fbclid=IwAR115ZAiEteO6J90Sov1YmHCyxdNnw_lhA_prGPjzHwFbpJE038iX8xEArM

The only difference I have noticed so far is that the dilation operator doesn't seem to have been applied. ("We also randomly applied the dilation operator from computer vision as further data augmentation since we observed that the stroke widths are quite uniform in the OMNIGLOT data, whereas there is substantial variation in MNIST, this augmentation improved the visual quality of the few-shot MNIST samples considerably")
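For reference, a minimal sketch of that dilation augmentation might look like the snippet below. This is only my assumption of how it could be applied, not code from this repo or the paper's exact recipe; the function `random_dilate`, the probability `p`, and the structuring-element `size` are all hypothetical choices.

```python
# Hypothetical random dilation augmentation for Omniglot-style images.
# Assumes images are (28, 28) float arrays in [0, 1] with ink as the
# high-intensity value; p and size are guesses, not values from the paper.
import numpy as np
from scipy.ndimage import grey_dilation

def random_dilate(image, p=0.5, size=2):
    """Randomly thicken strokes via grey-scale dilation."""
    if np.random.rand() < p:
        # grey_dilation replaces each pixel by the maximum over a
        # (size, size) neighbourhood, which widens the strokes.
        image = grey_dilation(image, size=(size, size))
    return image
```

Something like this could be applied per image during dataset preprocessing, before batching the Omniglot sets.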

From your experience working on this, are you aware of any issues, or is there anything I could do to bring my results closer to the output presented in the paper?

Any input is much appreciated.

conormdurkan commented 5 years ago

It’s been a while since I worked on this so apologies if I’m not totally familiar.

I was primarily interested in the generative modeling aspect of the neural statistician, and spent most of my time trying to generate nice faces with a modified architecture. I would say that the lack of meta-learning and few-shot results in the repo is basically due to my own lack of time investment, and not to any particular difficulty with implementing them.

If I recall correctly, while my implementation is close to the original specification, it’s not quite exactly what’s outlined in the paper. I think it should be reasonably straightforward to identify where my efforts differ.

Unfortunately, I’ve moved on to other work, and the repo is in need of some love, at least to update it to the latest version of PyTorch. Reproducing the results exactly has been on my to-do list for a while, but I’ve just not had the time, and I’m not sure when I’ll be able to get around to it.

I wish you all the best!
