For the DARTS complexity analysis, does anyone know how the (k+1)*k/2 expression is derived? Why are there 2 input nodes? How would the calculated value change if graph isomorphism were considered? Why are there "2+3+4+5" learnable edges? If the extra 1 stands for a lack of connection, shouldn't the paper leave it out, since it does not actually contribute to the learnable edge configurations at all?
Why do the weights for normal cells and reduction cells need to be trained separately, as shown in Figures 4 and 5 below?
How should the nodes be arranged so that the NAS search actually converges with minimum error? Note: not every node is connected to every other node.
Why is GDAS 10 times faster than DARTS?
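On the first question, one possible reading (my assumption, not something confirmed here) is that the k-th intermediate node can see the 2 input nodes plus the k-1 earlier intermediate nodes, so k+1 candidates in total, and keeps 2 of them as predecessors. Choosing 2 out of k+1 gives C(k+1, 2) = (k+1)*k/2. Under the same reading, "2+3+4+5" is just the number of candidate incoming edges of each of the 4 intermediate nodes. A minimal sketch to check the arithmetic, assuming 7 non-zero operations per edge and hypothetical names throughout:

```python
from math import comb

# Assumed cell layout: 2 input nodes, 4 intermediate nodes, 7 non-zero ops.
NUM_INPUT_NODES = 2
NUM_INTERMEDIATE_NODES = 4
NUM_OPS = 7


def candidate_predecessors(k: int) -> int:
    """Nodes the k-th intermediate node (1-indexed) can take input from."""
    return NUM_INPUT_NODES + (k - 1)


def predecessor_choices(k: int) -> int:
    """Ways to keep 2 of the k+1 candidates: C(k+1, 2) = (k+1)*k/2."""
    return comb(candidate_predecessors(k), 2)


def discrete_cell_count() -> int:
    """Product over nodes of (predecessor pairs) * (one op per kept edge)."""
    total = 1
    for k in range(1, NUM_INTERMEDIATE_NODES + 1):
        total *= predecessor_choices(k) * NUM_OPS ** 2
    return total


def learnable_edge_count() -> int:
    """Candidate edges before discretization: 2 + 3 + 4 + 5."""
    return sum(candidate_predecessors(k)
               for k in range(1, NUM_INTERMEDIATE_NODES + 1))


if __name__ == "__main__":
    print(learnable_edge_count())   # 14
    print(discrete_cell_count())    # 1037664180, roughly 10^9
```

This counts labelled graphs only, so isomorphic cells are counted more than once. It also says nothing about whether the extra "no connection" option should be included per edge.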