Closed. volgachen closed this issue 4 years ago.
Excuse me, in the paper you explained the reason for using a gradually decayed dropout rate. However, it seems that in your experiments the dropout rate increases during the search (0.0, 0.4, 0.7). I wonder whether I have overlooked some important detail.
The values (0.0, 0.4, 0.7) are the initial dropout rates for stages 1, 2, and 3, respectively. Within a specific stage, for example stage 2, the dropout rate is gradually decayed from 0.4 to around 0 over the course of that stage. Hope this helps.
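For illustration, here is a minimal sketch of how such a per-stage decay schedule could be computed. The function name `dropout_rate`, the linear decay form, and the 25-epoch stage length are assumptions for this example; only the initial rates (0.0, 0.4, 0.7) come from the reply above, and the actual repository may use a different decay formula.

```python
# Assumed initial dropout rates for search stages 1, 2, and 3 (from the reply above).
INITIAL_RATES = [0.0, 0.4, 0.7]


def dropout_rate(initial_rate, epoch, total_epochs):
    """Decay the dropout rate from `initial_rate` toward ~0 within one search stage.

    Linear decay is an assumption for this sketch; the repository may use a
    different schedule (e.g. exponential decay).
    """
    return initial_rate * (total_epochs - epoch) / total_epochs


# Example: print the schedule for stage 2 (initial rate 0.4) over an assumed 25-epoch stage.
if __name__ == "__main__":
    stage_epochs = 25
    for epoch in range(stage_epochs):
        rate = dropout_rate(INITIAL_RATES[1], epoch, stage_epochs)
        print(f"epoch {epoch:2d}: dropout rate {rate:.4f}")
```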
@chenxin061 Thank you! I have found the corresponding code now. Is there any explanation for the increasing initial dropout rates?
Very sorry for the late reply.
The reason for the increasing dropout rates is that a deeper search network falls into the trap of skip-connects more easily than a shallow one, so stronger regularization is needed.
Got it! Thank you.