DustinCarrion / cd-metadl

NeurIPS 2022 - Cross-domain meta-learning competition
https://metalearning.chalearn.org
Apache License 2.0

Testing #1

Closed by madclam 2 years ago

madclam commented 2 years ago

General comment: The tutorial is good for people who already know a little about meta-learning; otherwise it is a bit difficult to follow. Hence, we could direct people to the white paper if they do not understand something. In the white paper, we should explain better how the baselines work.

Beginner tutorial:

All good

Intermediate tutorial:

This part can be intimidating: explain that the data will be stored in a temporary file on Google Drive while you follow this tutorial and erased at the end, unless you choose to save it in your own drive. The download should take about one minute.

Explain that performing meta-learning is optional: MetaLearner can return a "hard-coded" learning algorithm (Learner).
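To make this concrete, the tutorial could include something like the following sketch. It assumes the API shape described in the tutorials (a MetaLearner whose meta_fit returns a Learner, whose fit in turn returns a Predictor); the exact signatures in the starting kit may differ, and the nearest-centroid classifier is purely illustrative:

```python
import numpy as np

class MyPredictor:
    def __init__(self, centroids, labels):
        self.centroids = centroids  # (N, d) one centroid per class
        self.labels = labels        # (N,) the class labels

    def predict(self, query_x):
        # Assign each query example to the class of its nearest centroid.
        dists = np.linalg.norm(
            query_x[:, None, :] - self.centroids[None, :, :], axis=-1
        )
        return self.labels[np.argmin(dists, axis=1)]

class MyLearner:
    def fit(self, support_x, support_y):
        # The "hard-coded" learning algorithm: a nearest-centroid
        # classifier fitted from scratch on each task's support set.
        labels = np.unique(support_y)
        centroids = np.stack(
            [support_x[support_y == c].mean(axis=0) for c in labels]
        )
        return MyPredictor(centroids, labels)

class MyMetaLearner:
    def meta_fit(self, meta_train_data):
        # Meta-learning is optional: ignore the meta-training data
        # and return a fixed Learner.
        return MyLearner()
```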

Advanced tutorial:

[...] this section presents the formal definition for both terms. => Add this sentence for clarification: Tasks are individual mini classification problems defined in the few-shot learning setting we are considering in this challenge. The meta-testing data will always be split into tasks. However, you can choose either to split the meta-training data into tasks as well, or to split it into batches (similar to what is usually done in "regular" learning problems).

Task: It represents an N-way k-shot task and it is defined as [...] => say again that the support set will have N classes (ways) and k examples per class, and that the query set will contain some examples of the same classes. IMPORTANT: in practice the query set is balanced, but the participants should not be able to exploit that. Also, they should not be able to learn from the query set. The API prevents that, right? ALSO: in real life, a task may not have the same number of examples for each class in the support set. We are not doing that, right?
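For instance, a small sketch of the task structure could make the N-way k-shot layout explicit (field names here are illustrative, not the competition API):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Task:
    num_ways: int          # N: number of classes in the task
    num_shots: int         # k: support examples per class
    support_x: np.ndarray  # shape (N * k, ...): support examples
    support_y: np.ndarray  # shape (N * k,): labels in {0, ..., N-1}
    query_x: np.ndarray    # query examples from the same N classes
    query_y: np.ndarray    # query labels (balanced in this challenge,
                           # but solutions must not exploit that, nor
                           # train on the query labels)
```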

the data contained in one task belongs extrictly => strictly

the meta-training split is composed by multiple datasets, => composed of

However, during the feedback and final phases you will have 10 datasets that you can use for meta-training and meta-validation. => In the final phase, we could provide set0 and set1 for meta-training

Batch: It is a collection of sampled examples from a dataset without enforcing a configuration. => It is unclear what you mean by "configuration", "dataset", "data". I would simply say:

Batch: It is a collection of examples sampled from the meta-training data. The meta-training data is first concatenated to create a single large dataset including all classes, from which batches of data are sampled, i.e., 𝒟_train = concat(𝒟_1, …, 𝒟_n). As before, the public data [...]
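A short sketch could illustrate this batch view. It uses standard PyTorch utilities with random placeholder datasets standing in for 𝒟_1, …, 𝒟_n, not the competition's own loaders:

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Placeholder datasets standing in for D_1, ..., D_n. In the real setup,
# labels from different datasets must be remapped into one global label space.
datasets = [
    TensorDataset(torch.randn(100, 3, 32, 32), torch.randint(0, 5, (100,)))
    for _ in range(3)
]

d_train = ConcatDataset(datasets)  # D_train = concat(D_1, ..., D_n)
loader = DataLoader(d_train, batch_size=32, shuffle=True)

for images, labels in loader:
    pass  # a regular (non-episodic) training step goes here
```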

After you define tasks and batches, perhaps you can also say that the methods differ in the way the preprocessing layers of the NN are trained and in the type of classifier used. Typically, people use in the top layer(s) a linear or fully connected 2-layer network, or a "prototypical network". The preprocessing layers (computing a feature embedding) are meta-trained either in an "episodic" manner (making use of a split into tasks) or in a "regular" batch-training manner.
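A sketch contrasting the two head choices on top of a shared embedding could help here (the architecture and names are illustrative only):

```python
import torch
import torch.nn as nn

# "Preprocessing" layers: a feature embedding network.
embedding = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

# Option 1: linear / fully connected head, meta-trained with regular
# batches over all meta-training classes (100 is an arbitrary example).
linear_head = nn.Linear(64, 100)

# Option 2: prototypical head -- no trained weights; classes are scored
# by distance to prototypes computed from the support embeddings.
def proto_logits(query_emb, support_emb, support_y, num_ways):
    prototypes = torch.stack(
        [support_emb[support_y == c].mean(dim=0) for c in range(num_ways)]
    )
    return -torch.cdist(query_emb, prototypes)  # negative distances as logits
```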

In a nutshell, what Prototypical Networks does is:

  1. => Explain that in the figure the projection would be in 2 dimensions (2 features).

The description of prototypical networks does not explain how the training is done (episodic or batch).
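For example, an episodic training step could be sketched roughly as follows, reusing the illustrative Task fields and proto_logits helper from the sketches above:

```python
import torch
import torch.nn.functional as F

def episodic_step(embedding, optimizer, task):
    # One episode = one task: embed support and query sets, classify the
    # query examples against the prototypes, and backpropagate the loss.
    support_x = torch.as_tensor(task.support_x, dtype=torch.float32)
    query_x = torch.as_tensor(task.query_x, dtype=torch.float32)
    support_y = torch.as_tensor(task.support_y, dtype=torch.long)
    query_y = torch.as_tensor(task.query_y, dtype=torch.long)

    logits = proto_logits(
        embedding(query_x), embedding(support_x), support_y, task.num_ways
    )
    loss = F.cross_entropy(logits, query_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Batch training, by contrast, would use the linear head with ordinary batches (as in the earlier DataLoader sketch) and only switch to the prototypical head at meta-test time.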


Local testing:


API:

It is important that the API be as backward compatible as possible with the previous challenge, so that it is easy for previous challenge participants to enter and for new participants to use the code of previous winners.


DustinCarrion commented 2 years ago

All comments have been addressed.