misleading error estimate for non-convergent integral (maxevals=10^7 reached)

unary-code commented 1 year ago

When I run the integral of f(x)="0.21702437*(0.6227007 + 1.4335294/(x + 1.6759317))/(x + 1.52924537 - 0.2316349/x)" from x=0 to x=1, by running this code in Python:

jl.eval("""

my_tuple = quadgk(x -> 0.21702437*(0.6227007 + 1.4335294/(x + 1.6759317))/(x + 1.52924537 - 0.2316349/x), 0, 1, rtol=0.01)
println(my_tuple)
""")

I consistently get a value of "0.17259607425281273" for my_tuple[1]. I would expect to get a value of infinity or undefined.

Interestingly, when I use PySR python package's eval_tree_array and I pass in a tree whose definition is:

x1 = Node(;feature=1)
tree = 0.21702437*(0.6227007 + 1.4335294/(x1 + 1.6759317))/(x1 + 1.52924537 - 0.2316349/x1)

tree = Node{Float32}(tree)

my_tuple = quadgk(x -> (eval_tree_array(tree, reshape([Float32(x)], 1, 1), options))[1][1], 0, 1, rtol=0.01)

I consistently get a value of "0.14588867513106551" for my_tuple[1]. Perhaps the difference between 0.17259 and 0.14588 is because I converted the tree to Float32 (instead of Float64).

If you open up Desmos.com and plot this function f(x) over x=0 to x=1, you will find that the function goes to negative infinity as x approaches 0.16 from the to-the-right direction, and the function goes to positive nfinity as x approaches 0.16 from the to-the-left direction.

And I'm pretty sure that in reality, the integral of this function is undefined. Although perhaps, the negative infinity cancels out the positive infinity, which means the value of the definite integral should be finite?

Thanks for all the help, in advance.

MartinMikkelsen commented 1 year ago

Have you looked into Cauchy principal value?

stevengj commented 1 year ago

Yes, that integrand has a simple pole at x ≈ 0.13886098968184463 (as well as two other poles outside of your $[0,1]$ interval … not at 0.16 — you may be misreading the plot), so it should be undefined, though it has a well-defined Cauchy principal value as @MartinMikkelsen mentions.

What's happening is that it's failing to converge — the integral value is basically oscillating around the Cauchy principal value as QuadGK refines the integral domain around the singularity, I think — but it's halting when it hits the default maxevals of 10^7. You can see this if you run quadgk_count to get an evaluation count:

julia> f(x) = 0.21702437*(0.6227007 + 1.4335294/(x + 1.6759317))/(x + 1.52924537 - 0.2316349/x);

julia> quadgk_count(f, 0,1)
(0.17259607425281273, 0.010904232497796203, 10000005)

If you pass in maxevals=typemax(Int) instead, to effectively remove this limit, then it seems to hang (though it might eventually hit an Inf and throw an error if you wait long enough, I'm not sure).

I'm not sure what the best thing to do would be. I'm not sure if there is a reliable way to detect that there is a problem here. I suppose one option would be for us to return Inf for the error estimate if the default maxevals is reached.

stevengj commented 1 year ago

For reference, the Cauchy principal value seems to be about 0.16 here. If we follow the example in the manual, we do:

First, you need to know the (nearly) exact pole $x_0$. I found it as x0 = 0.13886098968184463 to nearly machine precision using WolframAlpha; you could also use a Newton solver given an approximate location of the pole.
Second, you need the residue of the pole, i.e. the limit of $f(x) \times (x- x_0)$ as $x \to x0$. I did this with the Richardson.jl package via `res, = extrapolate(x -> f(x) * (x - x0), 0.3, x0=x0)`
Third, you need the integral with the pole subtracted. I did this with (integral,_,_) = quadgk_count(x -> f(x) - res/(x-x0), 0,1), which gives (0.11671583468777838, 1.0187684029716593e-13, 15) (note that it converges in only 15 evaluations — once the pole is subtracted, the function is very smooth in $[0,1]$!)
Fourth, you need to add the boundary term integral + res * log(abs((1-x0)/(0-x0))) as explained in the QuadGK tutorial example. This gives a Cauchy principal value of 0.15970640203323377

stevengj commented 1 year ago

Actually, now that I've calculated the Cauchy principal value, then it seems like quadgk's error estimate of ≈ 0.01 is actually pretty accurate in this example, if it is viewed as a difference between what quadgk returns and the principal value. I'm not sure how reliable that behavior is, however.

MartinMikkelsen commented 1 year ago

I've also calculated the Cauchy principal value but unfortunately didn't have time to post it. I followed a similar approach to you @stevengj and got 0.159706.

JuliaMath / QuadGK.jl

misleading error estimate for non-convergent integral (maxevals=10^7 reached) #82