tristandeleu / pytorch-meta

A collection of extensions and data-loaders for few-shot learning & meta-learning in PyTorch
https://tristandeleu.github.io/pytorch-meta/
MIT License
1.98k stars 256 forks

How to use torchmeta to debug algorithms quickly? #73

Closed renesax14 closed 4 years ago

renesax14 commented 4 years ago

I recently learned that it takes 12-16 hours to run miniImagenet and omniglot. However, it's important to be able to debug custom-made algorithms (and prototype quickly).

I suggest that we develop a dataset (perhaps a fraction of mini-Imagenet) such that we have a table of results for different algorithms, and that it runs in 2 hours instead. That way we can debug meta-learning algorithms more quickly without having to wait 16 hours.

I'm happy to help. What would be a good custom-made dataset we could make?

Does 12 labels with 300 images from mini-Imagenet sound like a good plan, with the CNN from MAML as an initial benchmark?
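To make the proposal concrete, here is a minimal, self-contained sketch of sampling N-way K-shot episodes from a reduced dataset like the one suggested (12 classes, 300 images each). All names here (`sample_episode`, `class_to_images`, the toy data) are illustrative assumptions, not part of Torchmeta's API:

```python
import random

def sample_episode(class_to_images, n_way, k_shot, k_query, rng=None):
    """Sample one N-way K-shot episode from a reduced dataset.

    class_to_images maps a class label to a list of image identifiers
    (e.g. file paths). Returns (support, query) lists of (image, label) pairs.
    """
    rng = rng or random.Random(0)
    classes = rng.sample(sorted(class_to_images), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        # Draw k_shot + k_query distinct images, then split them.
        images = rng.sample(class_to_images[cls], k_shot + k_query)
        support += [(img, label) for img in images[:k_shot]]
        query += [(img, label) for img in images[k_shot:]]
    return support, query

# A toy stand-in for the proposed subset: 12 classes with 300 images each.
toy_dataset = {f"class_{c}": [f"img_{c}_{i}" for i in range(300)]
               for c in range(12)}
support, query = sample_episode(toy_dataset, n_way=5, k_shot=1, k_query=15)
```

With a subset this small, episodes like these could be fed to any meta-learning loop for quick debugging before switching to the full benchmark.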

(I also plan to profile your code later, since I already reported that there is a different implementation of the data loader for mini-Imagenet that took only 2.5 hours to run; perhaps we can figure out why this one takes so long when it shouldn't. https://github.com/markdtw/meta-learning-lstm-pytorch)

tristandeleu commented 4 years ago

The 12-16 hours number I gave in a separate repository https://github.com/tristandeleu/pytorch-maml/issues/9#issuecomment-655433669 is for a specific implementation and a specific algorithm (MAML). It covers not only the data-loading part, but data-loading + meta-training of the model + periodic evaluation; the data-loading part represents only a fraction of those 12-16 hours. I would be very careful about comparing the running times of two different implementations of two different algorithms and concluding something about the data-loading. There could be some inefficiencies in the data-loading of Torchmeta, but this needs to be tested in isolation.

As for a special dataset for prototyping, I would like to keep in Torchmeta only standard datasets from the literature (as well as some special datasets like Pascal5i and TCGA). However, feel free to use the library to create your own subset of MiniImagenet for prototyping; you can take inspiration from the current implementation of MiniImagenet. Finally, Torchmeta does not provide any implementation of meta-learning algorithms (this is by design): it only provides the tools to implement these algorithms, together with some example scripts which are not meant to be full implementations. Having a table with results of different algorithms would be very interesting, but should be separate from this repository.

renesax14 commented 4 years ago

> This 12-16 hours number I gave in a separate repository tristandeleu/pytorch-maml#9 (comment) is for a specific implementation and a specific algorithm (MAML). This does not only concern the data-loading part, but data-loading + meta-training of the model + periodic evaluation; the data-loading part represents only a fraction of these 12-16 hours. I would be very careful comparing the running times of two different implementations of two different algorithms and conclude something about the data-loading. There could be some inefficiencies in the data-loading of Torchmeta, but this needs to be tested in isolation.
>
> As for a special dataset for prototyping, I would like to keep in Torchmeta only standard datasets from the literature (as well as some special datasets like Pascal5i and TCGA). However feel free to use the library to create your own subset of MiniImagenet for prototyping; you can take inspiration from the current implementation of MiniImagenet. Finally Torchmeta does not provide any implementation of meta-learning algorithms (this is by design): it only provides the tools to implement these algorithms, together with some example scripts which are not meant to be full implementations. Having a table with results of different algorithms would be very interesting, but should be separate from this repository.

Meta-LSTM is a much more complicated algorithm than MAML... it's very safe to say the difference is the data loader.

Just saying, I am grateful for the code you provide, for sure! But I do want to point out that meta-LSTM is more complicated than MAML, so it's much more likely that it's the data loader.

tristandeleu commented 4 years ago

I understand that MAML is simpler than meta-LSTM, and there could definitely be something inefficient in the data-loading in Torchmeta. However, as long as I don't know exactly where this inefficiency is coming from, I won't be able to fix it in Torchmeta. That's what I meant by testing both data-loading parts in isolation, as opposed to comparing wall-clock times.
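A minimal sketch of what "testing the data-loading in isolation" could look like: iterate over batches with no model attached and time only that. The helper name and the dummy loader below are assumptions for illustration; in a real test you would pass the actual Torchmeta data loader in place of the stand-in:

```python
import time

def time_data_loading(loader, num_batches=100):
    """Time pure data-loading: consume batches without any model computation."""
    start = time.perf_counter()
    for i, _batch in enumerate(loader):
        if i + 1 >= num_batches:
            break
    return time.perf_counter() - start

# Stand-in loader: any iterable of batches works. A real run would pass the
# actual few-shot data loader here instead, so the measured time reflects
# only data-loading, not meta-training or evaluation.
def dummy_loader():
    for _ in range(1000):
        yield [0] * 32  # pretend batch

elapsed = time_data_loading(dummy_loader(), num_batches=100)
```

Running the same harness against two different loaders (e.g. Torchmeta's and the meta-learning-lstm-pytorch one) with identical batch sizes would give a like-for-like comparison, unlike comparing end-to-end wall-clock times of two different algorithms.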