How do you deal with the tail and head in FairNAS

Hi,

Thanks for your work. I was wondering how you deal with gradient updates in the non-searchable stages of the model. Each searchable layer is updated only once, but multiple forward and backward passes then go through the tail/stem and the detection head. Do you average the gradients, or perhaps freeze the parameters of the non-searchable stages?
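To make the question concrete, here is a toy sketch of the situation I have in mind. It is not taken from the FairNAS code: the scalar model, the function names `path_grads` and `accumulate`, and the `average_shared` flag are all assumptions made up for illustration. One shared "stem" weight feeds K candidate "choice" weights; each choice receives a single gradient per step, while the shared weight receives K of them, which can either be summed or averaged.

```python
# Toy supernet: one shared "stem" weight and K candidate "choice" weights.
# All names here are hypothetical; this is NOT the FairNAS implementation.

def path_grads(w_stem, w_choice, x, target):
    """Gradients of 0.5 * (w_stem * w_choice * x - target) ** 2 for one path."""
    err = w_stem * w_choice * x - target
    grad_stem = err * w_choice * x    # d(loss) / d(w_stem)
    grad_choice = err * w_stem * x    # d(loss) / d(w_choice)
    return grad_stem, grad_choice


def accumulate(w_stem, choices, x, target, average_shared=False):
    """Run one pass per choice and accumulate gradients before any update."""
    stem_grad = 0.0
    choice_grads = []
    for w_choice in choices:
        g_stem, g_choice = path_grads(w_stem, w_choice, x, target)
        stem_grad += g_stem            # shared weight: hit once per path
        choice_grads.append(g_choice)  # each choice: hit exactly once
    if average_shared:
        stem_grad /= len(choices)      # option: average instead of sum
    return stem_grad, choice_grads


choices = [1.0, 2.0, 3.0]
summed, per_choice = accumulate(1.0, choices, x=1.0, target=0.0)
averaged, _ = accumulate(1.0, choices, x=1.0, target=0.0, average_shared=True)
print(summed, averaged, per_choice)
```

With summation the shared weight sees a gradient roughly K times larger than any single choice does, so its effective learning rate differs from that of the searchable layers. Averaging (or freezing the shared stages) would change that, which is what I am asking about.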
