An MLP with ReLU activation is added as an alternative evaluation head for the downstream tasks.
Folders were re-organized.
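A minimal sketch of the two evaluation heads side by side: a linear probe (logistic regression) and a non-linear MLP probe with ReLU activation, both trained on frozen representations. The data here is synthetic and purely illustrative; the actual feature dimensions, model hyperparameters, and training setup are assumptions, not the project's real configuration.

```python
# Illustrative comparison of a linear vs. a non-linear (MLP + ReLU) probe
# on frozen representations. Synthetic data stands in for real features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 64))                       # stand-in for frozen representations
labels = (features[:, 0] * features[:, 1] > 0).astype(int)  # deliberately non-linear target

X_tr, X_te, y_tr, y_te = train_test_split(features, labels, random_state=0)

linear_probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
mlp_probe = MLPClassifier(hidden_layer_sizes=(128,), activation="relu",
                          max_iter=1000, random_state=0).fit(X_tr, y_tr)

lin_acc = linear_probe.score(X_te, y_te)
mlp_acc = mlp_probe.score(X_te, y_te)
print("linear probe accuracy:", lin_acc)
print("MLP probe accuracy:   ", mlp_acc)
```

On a target like this one, which is not linearly separable in feature space, the gap between the two probes is exactly the kind of signal the proposed comparison is meant to capture.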
Motivation:
Provide another evaluation method that doesn't necessarily expect linearly separable representations (in contrast to logistic regression).
Experiment idea: Correlation analysis between downstream task accuracies gained through linear and non-linear evaluations.
Quoting from https://arxiv.org/pdf/1901.09005:
Using a linear model for evaluating the quality of a representation requires that the information relevant to the evaluation task is linearly separable in representation space. This is not necessarily a prerequisite for a “useful” representation. Furthermore, using a more powerful model in the evaluation procedure might make the architecture choice for a self-supervised task less important. Hence, we consider an alternative evaluation scenario where we use a multi-layer perceptron (MLP) for solving the evaluation task, details of which are provided in Supplementary Material.
Figure 3 clearly shows that the MLP provides only marginal improvement over the linear evaluation and the relative performance of various settings is mostly unchanged. We thus conclude that the linear model is adequate for evaluation purposes.
It has thus already been shown that an MLP adds little on top of logistic regression, so logistic regression suffices to measure the quality of learned representations. However, the paper quoted above does not provide any hypothesis testing. We can simply run n linear and non-linear evaluations, then apply Spearman correlation to the resulting accuracy pairs to support the following claims:
The correlation between MLP and logistic-regression results also holds in the medical domain.
Conducting a statistical test on the correlation provides systematically robust support for the claim. (The idea of systematically testing a hypothesis that seems intuitive and has long been taken for granted is not new: https://arxiv.org/pdf/1805.08974)
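The proposed analysis can be sketched as follows, assuming we already have n paired downstream-task accuracies: one list from the linear probe and one from the MLP probe, collected over the same evaluation runs. The accuracy values below are made up for illustration.

```python
# Spearman rank correlation between paired linear and MLP evaluation
# accuracies. The numbers are illustrative placeholders, not real results.
from scipy.stats import spearmanr

linear_acc = [0.71, 0.64, 0.82, 0.58, 0.90, 0.75, 0.69, 0.80]
mlp_acc    = [0.73, 0.66, 0.84, 0.61, 0.91, 0.85, 0.72, 0.83]

rho, p_value = spearmanr(linear_acc, mlp_acc)
print(f"Spearman rho = {rho:.3f}, p = {p_value:.4f}")
```

A high rho with a small p-value would support the claim that linear and MLP evaluations rank representations consistently, turning the visual argument from Figure 3 of the quoted paper into a testable hypothesis.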
Contribution:
A statistical test (Spearman correlation over paired linear and MLP evaluation accuracies) showing that the two evaluation methods agree in the medical domain as well.