General comment
The tutorial is good for people who already know a little about meta-learning; otherwise it is difficult to follow. Hence, we could direct people to the white paper if they do not understand something. In the white paper, we should explain better how the baselines work.
Beginner tutorial:
All good
Intermediate tutorial:
We will use the Public Data in this tutorial to better understand the difference between tasks and batches introduced in the beginner tutorial. For this reason, before continuing with the next sections, please execute the following cell to ensure that you have the Public Data. Take into account that this step will get all 10 public datasets (more than 66,000 files).
This part can be intimidating: explain that the data will be stored in a temporary file on Google Drive while you follow this tutorial and erased at the end, unless you choose to save it in your own drive. The download should take about one minute.
MetaLearner: It contains the meta-algorithm logic. The meta_fit(meta_train_generator, meta_valid_generator) method has to be overwritten with your own meta-learning algorithm. In general, a MetaLearner is meta-trained and returns a Learner, to be meta-tested.
Explain that performing meta-learning is optional: MetaLearner can return a "hard-coded" learning algorithm (Learner).
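To make this concrete, something like the following minimal sketch could go in the tutorial. The MetaLearner/Learner base classes and the meta_fit signature are from the API described above; NoMetaLearning and NearestCentroidLearner are hypothetical names for illustration.

```python
# Minimal sketch (not the official API code): the base classes below are
# stand-ins for the challenge API classes described in the tutorial.

class Learner:
    def fit(self, support_set):
        raise NotImplementedError

class MetaLearner:
    def meta_fit(self, meta_train_generator, meta_valid_generator):
        raise NotImplementedError

class NearestCentroidLearner(Learner):
    """Hypothetical hard-coded learning algorithm with no meta-trained state."""
    def fit(self, support_set):
        pass  # e.g., fit a nearest-centroid classifier on the support set

class NoMetaLearning(MetaLearner):
    """Skips meta-learning entirely and returns a fixed Learner."""
    def meta_fit(self, meta_train_generator, meta_valid_generator):
        return NearestCentroidLearner()  # the meta-training data is ignored
```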
Learner:
save(path): Saves your Learner in the specified path.
load(path): Loads your Learner from the file(s) you created in save(path).
Why do we need to overwrite these methods: couldn't we just save the whole object to a pickle?
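For instance, a default pickle-based implementation along these lines could be provided (a sketch assuming every Learner attribute is picklable; the PickleSaveMixin name and the learner.pkl filename are made up for illustration). If the answer is that torch modules, generators, or open file handles break pickling, the tutorial should say so, since that would explain why participants must override save/load.

```python
import os
import pickle

class PickleSaveMixin:
    """Hypothetical default: save/load the whole Learner via pickle.
    Only works if every attribute is picklable; torch models or open
    file handles would still require custom save()/load() methods."""

    def save(self, path):
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, "learner.pkl"), "wb") as f:
            pickle.dump(self, f)

    def load(self, path):
        with open(os.path.join(path, "learner.pkl"), "rb") as f:
            loaded = pickle.load(f)
        self.__dict__.update(loaded.__dict__)  # restore state in place
        return self
```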
Predictor: It contains the logic of your Learner to make predictions once it is fitted. The predict(query_set) encapsulates this step. => [...] The predict(query_set) method [...]
Data generators: it is not clear whether it is permitted to overwrite them. Challenge participants often like to rewrite them. In fact, the loader of MetaDelta was more efficient, which is in part why they won. We might want to borrow it from them.
The tensor with the support set images has the following shape: torch.Size([80, 3, 128, 128]) => It would be helpful to explain what the dimensions of the tensor are. Explain that the 1st dimension = number of examples = num_shots * num_ways, where num_ways = number of labels (see the sketch below). Explain how the split between training and validation data is done.
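For example (illustrative values only; 5 ways x 16 shots is just one factorization of 80, and the actual configuration comes from the task generator):

```python
import torch

num_ways, num_shots = 5, 16  # illustrative: 5 * 16 = 80 support examples
support_images = torch.randn(num_ways * num_shots, 3, 128, 128)

n_examples, channels, height, width = support_images.shape
assert n_examples == num_ways * num_shots   # dim 0: num_shots * num_ways
# dims 1-3: 3-channel (RGB) images of size 128 x 128
print(n_examples, channels, height, width)  # 80 3 128 128
```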
Advanced tutorial:
When you show the code, the text of the next cell is displaced to the right. Add a new line?
It is easy to find the baselines now, nicely done!
Details:
[...] this section presents the formal definition for both terms. => Add this sentence for clarification:
Tasks are individual mini classification problems defined in the few-shot learning setting we are considering in this challenge. The meta-testing data will always be split into tasks. However, you have a choice to either split the meta-training data also into tasks or split it into batches (similarly to what is usually done in "regular" learning problems).
Task: It represents a N-way k-shot task and it is defined as [...] => say again that the support set will have N classes (ways) and k examples per class, and that the query set contains some examples of the same classes.
IMPORTANT: in practice you have a balanced query set, but the participants should not be able to exploit that. Also they should not be able to learn from the query set. The API prevents that, right?
ALSO: in real life, a task may not have the same number of examples for each class in the support set. We are not doing that, right?
the data contained in one task belongs extrictly => strictly
the meta-training split is composed by multiple datasets, => composed of
However, during the feedback and final phases you will have 10 datasets that you can use for meta-training and meta-validation. => In the final phase, we could provide set0 and set1 for meta-training
Batch: It is a collection of sampled examples from a dataset without enforcing a configuration. => It is unclear what you mean by "configuration", "dataset", "data". I would simply say:
Batch: It is a collection of examples sampled from the meta-training data. The meta-training data is first concatenated to create a single large dataset including all classes, from which batches of data are sampled, i.e., $\mathcal{D}_{\text{train}} = \mathrm{concat}(\mathcal{D}_1, \dots, \mathcal{D}_N)$. As before, the public data [...]
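A short sketch of what this concatenation amounts to, using standard torch utilities with toy stand-ins for the datasets D_1, ..., D_N (sizes, shapes, and label ranges are illustrative):

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Toy stand-ins for two meta-training datasets with disjoint class labels.
d1 = TensorDataset(torch.randn(100, 3, 128, 128), torch.randint(0, 10, (100,)))
d2 = TensorDataset(torch.randn(100, 3, 128, 128), torch.randint(10, 20, (100,)))

# D_train = concat(D_1, ..., D_N): one large dataset pooling all classes.
d_train = ConcatDataset([d1, d2])

# Batches are then sampled with no N-way k-shot structure enforced.
loader = DataLoader(d_train, batch_size=32, shuffle=True)
images, labels = next(iter(loader))
print(images.shape, labels.shape)  # torch.Size([32, 3, 128, 128]) torch.Size([32])
```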
After you define tasks and batches, perhaps you can also say that the methods differ in the way the preprocessing layers of the NN are trained and in the type of classifier used. Typically, people use in the top layer(s) a linear or fully connected 2-layer network, or a "prototypical network". The preprocessing layers (computing a feature embedding) are meta-trained either in an "episodic" manner (making use of a split into tasks) or in a "regular" batch training manner.
In a nutshell, what Prototypical Networks does is:
=> Explain that in the figure the projection would be in 2 dimensions (2 features).
The description of prototypical networks does not explain how the training is done (episodic or batch).
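For the episodic case, a sketch like the following could be added (assuming task labels are re-indexed 0..N-1 within each task and embed is any feature extractor; the generator in the commented loop is hypothetical):

```python
import torch
import torch.nn.functional as F

def prototypical_loss(embed, support_x, support_y, query_x, query_y, n_ways):
    """One episodic step of a Prototypical Network (sketch)."""
    z_support = embed(support_x)                      # (n_ways * k, d)
    z_query = embed(query_x)                          # (n_query, d)
    # Prototype = mean embedding of each class's support examples.
    prototypes = torch.stack(
        [z_support[support_y == c].mean(dim=0) for c in range(n_ways)])
    # Score queries by negative squared distance to each prototype.
    logits = -torch.cdist(z_query, prototypes) ** 2   # (n_query, n_ways)
    return F.cross_entropy(logits, query_y)

# Episodic meta-training: sample tasks, backprop the query loss through embed.
# for support_x, support_y, query_x, query_y in task_generator:  # hypothetical
#     loss = prototypical_loss(embed, support_x, support_y, query_x, query_y, 5)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
```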
Local testing:
In this tutorial, we will use the Public Data to locally test => what is meant by "locally test": on your own computer or in your own drive? Specify. Can people perform experiments on Colab and save the results in their drive?
Clarify what the installation does and where the public data ends up. It should be configurable where data (and results) end up on the user's drive (even though you can provide a default).
API:
It is important that the API be as backward compatible as possible with the previous challenge so it is easy for previous challenge participants to enter or for new participants to use the code of previous winners.