Dear @bio-mlhui
Indeed, in our original paper (NeurIPS '23), we directly implemented the generalization-based objective by training a linear model on a training split and assessing its generalization on a test split. That procedure requires backpropagating through the inner optimization (we used 300 inner iterations) to get reliable estimates of the objective and its gradients, and is therefore costly.
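For concreteness, here is a minimal PyTorch sketch of what such a generalization-based bilevel objective can look like; the function names, the soft cross-entropy loss, and the hyperparameters are illustrative assumptions, not the repository's actual code:

```python
import torch
import torch.nn.functional as F

def soft_ce(logits, targets):
    # Cross-entropy with soft targets; differentiable w.r.t. both arguments.
    return -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

def generalization_objective(feats_tr, q_tr, feats_te, q_te,
                             inner_steps=300, inner_lr=0.1):
    # Inner problem: fit a linear head on the train split with an unrolled,
    # differentiable SGD loop (create_graph keeps the updates in the graph).
    W = torch.zeros(feats_tr.shape[1], q_tr.shape[1], requires_grad=True)
    for _ in range(inner_steps):
        (g,) = torch.autograd.grad(soft_ce(feats_tr @ W, q_tr), W,
                                   create_graph=True)
        W = W - inner_lr * g
    # Outer objective: loss of the trained head on the held-out test split.
    # Backpropagating this reaches the labeling through all inner steps,
    # which is what makes this formulation expensive.
    return soft_ce(feats_te @ W, q_te)
```

Here `q_tr` / `q_te` stand for soft pseudo-labels produced by the outer labeling model, so the outer gradient flows back into them through the unrolled inner loop.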
However, in our ICML '24 paper we found that the original optimization essentially corresponds to finding a labeling that induces the linear model with the highest margin (see Proposition 3.1 in the paper). This result also reveals that you don't really need the train/test splits, since maximizing the margin already implies better generalization (see Remark 3.3 in the paper). This, in turn, enables efficient optimization: in practice we don't need to backpropagate through the inner optimization process (see the Efficient optimization paragraph in the paper), and a small number of inner iterations suffices (we use 10 inner steps).
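And here is a correspondingly hedged sketch of the efficient pattern described above (same whole batch, a few inner steps, no backpropagation through the inner loop); again, the exact losses and hyperparameters are assumptions rather than the repo's implementation:

```python
import torch
import torch.nn.functional as F

def soft_ce(logits, targets):
    # Cross-entropy with soft targets; differentiable w.r.t. both arguments.
    return -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

def margin_style_objective(feats, q, inner_steps=10, inner_lr=0.1):
    # Inner loop on detached tensors: the outer gradient never flows
    # through these updates, which is what keeps this variant cheap.
    W = torch.zeros(feats.shape[1], q.shape[1], requires_grad=True)
    opt = torch.optim.SGD([W], lr=inner_lr)
    feats_d, q_d = feats.detach(), q.detach()
    for _ in range(inner_steps):
        opt.zero_grad()
        soft_ce(feats_d @ W, q_d).backward()
        opt.step()
    # Outer loss on the same whole batch with the inner head frozen:
    # gradients reach only the labeling q (and the features), not the
    # unrolled inner optimization path.
    return soft_ce(feats @ W.detach(), q)
```

The key difference from the previous sketch is that the inner head is treated as a constant when computing the outer gradient, so no train/test split and no unrolled backpropagation are needed.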
Let me know if these clarifications helped.
Best, Artyom
Hi, your work is awesome! I have one question I'd like your help with: I saw that, for each batch, the whole batch is used to learn a pseudo-optimal (10 steps) inner classifier, and then the same whole batch is used to learn the outer classifier.
Could you explain why you don't split the batch into train and test parts and use the train/test splits to learn the inner/outer classifier, respectively? That seems more consistent with the "generalization"-based loss (Eq. 1 in your paper).
Thanks so much in advance!