tristandeleu / pytorch-meta

A collection of extensions and data-loaders for few-shot learning & meta-learning in PyTorch
https://tristandeleu.github.io/pytorch-meta/
MIT License

What is the correct way to sample a task in the regression case to avoid overfitting (current way might overfit)? #84

Closed: renesax14 closed this issue 4 years ago

renesax14 commented 4 years ago

I want to raise awareness that the way pytorch-meta (and perhaps related algorithms) currently samples tasks in regression might give skewed, overly optimistic results.

If one samples a single function as a task, then the inner adaptation loop could overfit, because it only ever sees examples from that specific function. This does not happen in classification, because we sample N classes to define a task; but in regression we do not sample N functions to define a task.

In the past I've made the mistake of defining a task as a single class, and I got 100% accuracy on the query set every time (even when the raw images were fully random). This was because the inner loop simply changed the parameters to output that class label no matter what, since it only ever saw examples from that class.
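To make that concrete, here is a tiny standalone sketch (not Torchmeta code) of why a single-class "task" gives 100% query accuracy even on random inputs; `make_single_class_task` is a hypothetical helper for illustration only:

```python
# Minimal sketch (not Torchmeta code): why a "task" made of a single class is degenerate.
# A predictor that just memorizes the only label in the support set already gets
# 100% query accuracy, because the query set contains that same single class.
import random

def make_single_class_task(label, num_support=5, num_query=15):
    # Inputs are irrelevant here; even random inputs give 100% accuracy.
    support = [(random.random(), label) for _ in range(num_support)]
    query = [(random.random(), label) for _ in range(num_query)]
    return support, query

support, query = make_single_class_task(label=3)
# "Adaptation" that collapses to always predicting the single label it saw.
predicted_label = support[0][1]
accuracy = sum(y == predicted_label for _, y in query) / len(query)
print(accuracy)  # 1.0, regardless of the inputs
```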

I wonder if a similar problem might affect regression. Perhaps sampling N functions per task would be safer and give more realistic results?
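For illustration, here is a hypothetical sketch of what sampling N functions per task could look like; `sample_sinusoid`, `make_multi_function_task`, and the amplitude/phase/shot choices are my own assumptions for this example, not part of Torchmeta:

```python
# Hypothetical sketch of the proposal: a regression "task" built from N functions,
# analogous to N-way classification, instead of a single sinusoid.
import numpy as np

def sample_sinusoid(rng):
    # Standard sinusoid-regression ranges (an assumption here, not taken from Torchmeta).
    amplitude = rng.uniform(0.1, 5.0)
    phase = rng.uniform(0.0, np.pi)
    return lambda x: amplitude * np.sin(x - phase)

def make_multi_function_task(num_functions=5, shots_per_function=10, rng=None):
    # One task = a mixture of N different sinusoids, so the inner loop cannot
    # simply memorize a single underlying function.
    rng = rng or np.random.default_rng()
    functions = [sample_sinusoid(rng) for _ in range(num_functions)]
    xs, ys = [], []
    for f in functions:
        x = rng.uniform(-5.0, 5.0, size=shots_per_function)
        xs.append(x)
        ys.append(f(x))
    return np.concatenate(xs), np.concatenate(ys)

inputs, targets = make_multi_function_task()
print(inputs.shape, targets.shape)  # (50,) (50,)
```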

I documented this in this SO question:

https://stats.stackexchange.com/questions/472353/what-is-the-standard-way-to-define-task-in-meta-learning-for-few-shot-classifi

Code where Torchmeta samples a single function as a task:

https://github.com/tristandeleu/pytorch-meta/blob/5ab670d674bf1bf063e206dd046f44d73deaec25/torchmeta/toy/sinusoid.py#L83
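Roughly, the gist of that sampling is that every point in a task comes from one sinusoid. The sketch below is a simplified paraphrase, not the actual file, and the parameter ranges are assumptions based on the standard sinusoid regression benchmark:

```python
# Simplified paraphrase (not the actual Torchmeta code): one task = one sinusoid.
# Every support and query point comes from the same underlying function, which is
# what I argue above the inner loop can overfit to.
import numpy as np

def sample_single_function_task(num_samples=10, rng=None):
    rng = rng or np.random.default_rng()
    # Ranges follow the usual sinusoid-regression benchmark (an assumption here).
    amplitude = rng.uniform(0.1, 5.0)
    phase = rng.uniform(0.0, np.pi)
    inputs = rng.uniform(-5.0, 5.0, size=num_samples)
    targets = amplitude * np.sin(inputs - phase)
    return inputs, targets

inputs, targets = sample_single_function_task()
print(inputs.shape, targets.shape)  # (10,) (10,)
```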

tristandeleu commented 4 years ago

I think there is indeed some confusion in what you describe between what we call a task and the process of adaptation.

I suggest you have a look at the ICML 2019 tutorial on meta-learning; it might make things clearer. These general questions are not specific to Torchmeta, so I suggest you ask them on Stack/MathOverflow (like the question you linked to), where you'll probably get more answers.

renesax14 commented 4 years ago

Just two comments:

Nevertheless, thanks for the link to the tutorial; it looks useful (though from skimming it, it seems they are re-using slides from talks I've already seen).