Open MartinXPN opened 3 years ago
The original implementation of DARTS appears to divide `(positive - negative)` by `(2 * eps)`:
return [(x-y).div_(2*R) for x, y in zip(grads_p, grads_n)]
In pt.darts, the parentheses around `(2 * eps)` seem to be omitted. Because `/` and `*` have the same precedence and are applied left to right, the final expression is evaluated as `eps * (positive - negative) / 2`, which gives wrong Hessian values: https://github.com/khanrc/pt.darts/blob/48e71375c88772daac376829fb4bfebc4fb78144/architect.py#L108
Shouldn't the divisor be `(2 * eps)` instead of `2 * eps`?
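
To make the precedence difference concrete, here is a minimal sketch using plain Python floats in place of the gradient tensors. The helper `central_difference` and the test function `x ** 2` are hypothetical and only for illustration; they are not taken from either repository.

```python
def central_difference(f, x, eps):
    """Return (parenthesized, unparenthesized) finite-difference estimates.

    Hypothetical scalar stand-in for the per-tensor Hessian computation.
    """
    positive = f(x + eps)
    negative = f(x - eps)
    # Intended form: divide the difference by the full step width 2 * eps.
    parenthesized = (positive - negative) / (2 * eps)
    # Without parentheses: evaluated left to right as ((positive - negative) / 2) * eps.
    unparenthesized = (positive - negative) / 2 * eps
    return parenthesized, unparenthesized


eps = 0.01
parenthesized, unparenthesized = central_difference(lambda x: x ** 2, 3.0, eps)
print(parenthesized)    # close to 6.0, the derivative of x**2 at x = 3
print(unparenthesized)  # smaller than the value above by a factor of eps**2
```

Under these assumptions the two results differ by a factor of `eps ** 2`, so with a small `eps` the unparenthesized form shrinks the estimate toward zero.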