Thanks for your impressive work and the released code!
I noticed that in DARTS, the affine parameters of BN are not learnable during the search phase. The authors state:
"Learnable affine parameters in all batch normalizations are disabled during the search process to avoid rescaling the outputs of the candidate operations."
In contrast, BN appears to be learnable by default in the search phase of GDAS (unless I missed something).
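To make concrete what I mean by "rescaling", here is a minimal pure-Python sketch. It is not the DARTS or GDAS code: `batch_norm`, `softmax`, `std`, the toy batch, and the value `gamma=3.0` are all hypothetical, chosen only to illustrate how a learned BN scale could change the relative magnitude of two candidate operations that have equal architecture weights.

```python
import math


def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize scalars over the batch, then apply the affine transform."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in xs]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def std(xs):
    """Population standard deviation of a list of scalars."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))


batch = [1.0, 2.0, 3.0, 4.0]

# Equal architecture weights for two candidate operations: [0.5, 0.5]
alphas = softmax([0.0, 0.0])

# Candidate A: affine disabled (gamma fixed to 1, beta fixed to 0)
op_a = batch_norm(batch)

# Candidate B: affine enabled, with a hypothetical learned gamma of 3.0
op_b = batch_norm(batch, gamma=3.0)

# Magnitude each candidate contributes to the weighted sum
contrib_a = alphas[0] * std(op_a)
contrib_b = alphas[1] * std(op_b)
print(contrib_b / contrib_a)
```

In this toy setting, candidate B contributes roughly three times the magnitude of candidate A even though both have the same architecture weight, which is how I read the concern in the DARTS quote.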
Do the learnable affine parameters affect the search results? Could you give me some hints?
Thanks in advance!