This PR removes the lower bound on the size of `epsilon`. It fixes the current problem for small `x` with some functions, e.g. `f(x) = log(x)`. MWE of the issue:
```julia
julia> fun(x) = log(x)
fun (generic function with 1 method)

julia> x0 = 1e-50
1.0e-50

julia> Calculus.derivative(fun, x0, :central)
ERROR: DomainError with -6.0554544523933395e-6
```
It is not only a problem of throwing an error; it is also a problem for accuracy, as illustrated in @andreasnoack's Slack example, reproduced here:
```julia
julia> [Calculus.derivative(log, x, :central)*x - 1 for x in [1e-2, 1e-3, 1e-4, 1e-5]] # gives a large relative error
4-element Array{Float64,1}:
 1.2222802592276594e-7
 1.2223111765852224e-5
 0.0012249805130359892
 0.1590499883576919

julia> function mycentraldiff(f, x) # no lower bound does not give such a large error
           ϵ = sqrt(eps(x))
           return (f(x + ϵ) - f(x - ϵ))/(2*ϵ)
       end
mycentraldiff (generic function with 1 method)

julia> [mycentraldiff(log, x)*x - 1 for x in [1e-2, 1e-3, 1e-4, 1e-5]]
4-element Array{Float64,1}:
 -2.5553159588298513e-10
  0.0
  0.0
 -3.9739767032642703e-11
```
Note this PR is not the ideal solution. In particular, the test for the gradient of `f(x) = sin(x[1]) + cos(x[2])` at `[0, 0]`, which currently passes thanks to the lower bound, does not pass with this PR. (I therefore changed the test to be at `[1, 1]` rather than `[0, 0]`.)
My understanding is that even for simpler cases, e.g. `f(x) = x + a` where `a` is big, `epsilon` must be big too in order not to be collapsed to 0 when added to `a`. (That means that if `epsilon` just scales with `x`, the finite difference will only work for `x` of a magnitude similar to `a`.)
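To make that `f(x) = x + a` concern concrete, here is a minimal sketch (the function names are made up for illustration) using the same purely relative `sqrt(eps(x))` step as `mycentraldiff` above:

```julia
# Same purely relative step as mycentraldiff above.
centraldiff(f, x) = (ϵ = sqrt(eps(x)); (f(x + ϵ) - f(x - ϵ)) / (2ϵ))

a = 1e10
shift(x) = x + a   # exact derivative is 1 everywhere

# At x = 1.0, ϵ ≈ 1.5e-8, but eps(1e10) ≈ 1.9e-6, so shift(x + ϵ) and
# shift(x - ϵ) round to the same Float64 and the computed derivative
# collapses to 0.0 instead of 1.0.
centraldiff(shift, 1.0)
```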
However, this seems like a rarer and more pathological case to me than the `f(x) = log(x)` case. So my preference goes to removing the `max(1, abs(x))`, hence this PR.
As noted by Jeffrey Sarnoff, an ideal solution should implement some lower bound. Maybe one of you knows a better way to implement said lower bound so that it is not a problem in cases like `f(x) = log(x)`?
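For what it's worth, one possible scheme (purely an assumption on my part, not something this PR implements) would be a relative step with an absolute floor that only kicks in at `x == 0`, so that the step stays far below `x` even for tiny nonzero `x`:

```julia
# Sketch of a hypothetical step-size rule (not the package's current API):
# relative step sqrt(eps) * |x|, with an absolute floor only when x == 0.
function stepsize(x::AbstractFloat)
    h = sqrt(eps(typeof(x))) * abs(x)
    return iszero(h) ? sqrt(eps(typeof(x))) : h
end

centraldiff(f, x) = (h = stepsize(x); (f(x + h) - f(x - h)) / (2h))

centraldiff(log, 1e-50) * 1e-50 - 1  # small relative error, no DomainError
centraldiff(sin, 0.0)                # ≈ 1.0, so a [0, 0] gradient would work
```

With this rule, `log` at `x = 1e-50` gets a step of about `1.5e-58`, well inside the domain, while `sin`/`cos` at `0` still get a nonzero step.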
Another thing noted by @andreasnoack is that maybe `cbrt` is not ideal in the `:central` difference and could be replaced by `sqrt` (as in the example above). I am not sure about that either, but I was told to explicitly raise the issue here as well 😃.
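For reference, the textbook trade-off for a central difference is truncation error O(h²) against rounding error O(eps/h), which balances at h ∝ eps^(1/3), i.e. `cbrt` (for a forward difference the balance is at h ∝ eps^(1/2), i.e. `sqrt`). A quick check on a well-scaled function (exact errors will vary with `f` and `x`):

```julia
central(f, x, h) = (f(x + h) - f(x - h)) / (2h)

x = 1.0
err_cbrt = abs(central(exp, x, cbrt(eps(x))) - exp(x))  # roughly 1e-10 level
err_sqrt = abs(central(exp, x, sqrt(eps(x))) - exp(x))  # roughly 1e-8 level
```

So on well-scaled inputs the `cbrt` step looks reasonable for `:central`; the large errors in the Slack example above may come mostly from `max(1, abs(x))` making the step comparable to `x` itself, rather than from `cbrt` per se, though I have not analyzed this carefully.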
[Note on why I would like this to be solved:] I want to be able to use Julia's state-of-the-art numerical differentiation tools (AD, dual and hyperdual numbers, etc.) and to quantify how these tools perform vs. state-of-the-art finite differences in some specific scientific applications. But this `f(x) = log(x)` issue is preventing me from doing so...
One last note: I edited the tests quite a bit (apart from the `[0,0]`-to-`[1,1]` change), but it was mostly to enclose each group of tests in a `@testset` of its own, so that I can find where tests fail more easily.