Closed fusiming3 closed 3 years ago
Hi, @fusiming3
BN layers are not trained but recalibrated before evaluation.
The BN statistics (running mean μ and running variance σ²) accumulated during supernet training are not correct: the inputs to each block can come from different paths, and the statistics are not updated at every step. A common technique to fix this is BN recalibration, which is defined here:
https://github.com/changlin31/DNA/blob/570c708c950e8bf0a7d5f3dc949163ceb5e49b0a/searching/timm/utils.py#L338
When performing BN recalibration (also called BN correction), the statistics (μ and σ²) of the BN layers are reset and recalculated on part of the training set, while the weights (γ) and biases (β) remain unchanged. After that, the model is set to eval mode and evaluated normally. This method is called multiple times in `_potential()`.
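The recalibration step described above can be sketched in PyTorch as follows. This is a minimal, illustrative version (the function name `recalibrate_bn` and the `num_batches` parameter are my own; the repo's implementation lives in `searching/timm/utils.py`):

```python
import torch
import torch.nn as nn

def recalibrate_bn(model, loader, num_batches=32, device="cpu"):
    """Reset BN running statistics and re-estimate them on a slice of
    the training set. Weights (gamma) and biases (beta) are untouched."""
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()   # running_mean -> 0, running_var -> 1
            m.momentum = None         # use a cumulative moving average
    model.train()                     # BN only updates stats in train mode
    with torch.no_grad():             # no gradients, so gamma/beta stay fixed
        for i, (x, _) in enumerate(loader):
            if i >= num_batches:
                break
            model(x.to(device))
    model.eval()                      # ready for normal evaluation
    return model
```

Note that `model.train()` plus `torch.no_grad()` is the key combination: train mode lets BN accumulate fresh statistics, while the absence of gradient updates keeps every learned parameter unchanged.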
`model_pool` is a list containing the encodings of all possible paths. The list is generated by https://github.com/changlin31/DNA/blob/570c708c950e8bf0a7d5f3dc949163ceb5e49b0a/searching/dna/distill_train.py#L865 and is used when sampling paths for training. By default `guide_input=True`, so the model pool is discarded and regenerated before each stage: https://github.com/changlin31/DNA/blob/570c708c950e8bf0a7d5f3dc949163ceb5e49b0a/searching/dna/distill_train.py#L50-L51
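Conceptually, regenerating the pool for a stage just means enumerating every combination of candidate operators over that stage's blocks. A hypothetical sketch (the helper `build_model_pool` and its parameters are illustrative, not the repo's actual function):

```python
from itertools import product

def build_model_pool(num_blocks, ops_per_block):
    """Enumerate every path encoding in one supernet stage.

    Each encoding is a tuple of operator indices, one entry per block,
    so the pool size is ops_per_block ** num_blocks.
    """
    return list(product(range(ops_per_block), repeat=num_blocks))

pool = build_model_pool(num_blocks=3, ops_per_block=4)
# 4 operator choices over 3 blocks -> 64 candidate paths
```

Because the pool is rebuilt per stage, its size stays bounded by that stage's search space rather than the whole supernet's, which is what makes exhaustive enumeration feasible.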
Thanks for your code! I have a question about the definition of `_potential(supernet, teacher=None, eval_loader=None, loss_fn=None, args=None, stage=None, stage_model_pool=None, use_target=False, reset_data=None)` in DNA/searching/dna/distill_train.py. Why should the BN layers in the teacher and the supernet be trained separately? In my understanding, BN layers are fixed during evaluation. Also, the method of searching the model pool is complex: how is the model pool updated in `_train_stage` in DNA/searching/dna/distill_train.py?