khanrc / pt.darts

PyTorch Implementation of DARTS: Differentiable Architecture Search
MIT License

Hessian computation issue (seems like there is a bug) #43

Open MartinXPN opened 3 years ago

MartinXPN commented 3 years ago

The original implementation of DARTS appears to divide (positive - negative) by (2 * eps):

return [(x-y).div_(2*R) for x, y in zip(grads_p, grads_n)]

In pt.darts, the parentheses around (2 * eps) seem to be omitted. Since / and * have the same precedence in Python and are evaluated left to right, the final expression is evaluated as eps * (positive - negative) / 2, which gives wrong Hessian values: https://github.com/khanrc/pt.darts/blob/48e71375c88772daac376829fb4bfebc4fb78144/architect.py#L108

Shouldn't the divisor be (2 * eps) instead of 2 * eps?
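
To make the difference concrete, here is a minimal sketch of the two variants on a scalar function. This is not the pt.darts code: the function names and the test function `square` are hypothetical, and a plain scalar derivative stands in for the Hessian-vector product that DARTS approximates with the same central-difference formula.

```python
def central_difference(f, x, eps):
    """Central finite difference with the divisor parenthesized: (p - n) / (2 * eps)."""
    positive = f(x + eps)
    negative = f(x - eps)
    return (positive - negative) / (2 * eps)


def central_difference_no_parens(f, x, eps):
    """Same formula without parentheses; Python parses it as ((p - n) / 2) * eps."""
    positive = f(x + eps)
    negative = f(x - eps)
    return (positive - negative) / 2 * eps


def square(x):
    # Hypothetical test function; its exact derivative is 2 * x.
    return x * x


eps = 0.01
correct = central_difference(square, 3.0, eps)          # approximates 2 * 3.0
wrong = central_difference_no_parens(square, 3.0, eps)  # smaller by a factor of eps ** 2

print("with parentheses:   ", correct)
print("without parentheses:", wrong)
```

With the parentheses the result approximates the true derivative; without them it is scaled down by a factor of eps ** 2, so for a small eps the Hessian term would nearly vanish from the update.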