JuliaIntervals / IntervalRootFinding.jl

Library for finding the roots of a function using interval arithmetic
https://juliaintervals.github.io/IntervalRootFinding.jl/

How to show there is no roots on unbounded domain? #149


newptcai commented 4 years ago

Let's take the simplest example. If I take f(x) = x and I want to show that there is no root of this function from 1 to infinity, how can I do this?

A more complicated example is

f(x, y)=sinh(x) - 1/2 * (cosh(x)/cosh(x-y))^(x/y)

Taking y = 1/2*log(2), this function remains positive but goes to 0 as x goes to infinity. Is there any way we can show this with interval arithmetic?

dpsanders commented 4 years ago

You can use infinity as follows:

julia> f(x) = x
f (generic function with 2 methods)

julia> roots(f, 1..∞)   # type as \infty<TAB>.  Or use Inf
0-element Array{Root{Interval{Float64}},1}

The fact that the list of roots is empty means that there is no root.

The second question is much harder. One of the problems is that your function has x and y repeated multiple times, which is bad news for interval arithmetic (the "dependency problem"). The other is that there seem to be tricky cancellations.
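
A minimal sketch of the dependency problem: each occurrence of a variable is treated as an independent interval, so even x - x is not computed as zero.

julia> using IntervalArithmetic

julia> x = 1..2
[1, 2]

julia> x - x   # the true range is {0}, but each occurrence of x is treated independently
[-1, 1]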

dpsanders commented 4 years ago

Your function is particularly nasty due to those hyperbolic functions, e.g.

julia> @time roots(f, -5..6)
  0.691032 seconds (4.82 M allocations: 148.984 MiB, 2.00% gc time)
1-element Array{Root{Interval{Float64}},1}:
 Root([0.597845, 0.597846], :unique)

julia> @time roots(f, -5..7)
  5.044978 seconds (34.90 M allocations: 1.053 GiB, 1.84% gc time)
1-element Array{Root{Interval{Float64}},1}:
 Root([0.597845, 0.597846], :unique)

As you increase the right endpoint of the interval, the time to prove that there is a unique root increases dramatically. This is basically because of the cancellation in your function. Is there a way to rewrite the function somehow?

In general, showing that a function is positive on a range using interval arithmetic should be "easy" by just evaluating the function over the interval, e.g.

julia> g(x) = x^2 - 2
g (generic function with 1 method)

julia> g(3..4)
[7, 14]

This proves that the image of the function g on the interval [3, 4] is contained in [7, 14] and hence is positive.

However, again the nature of your function makes this very difficult, e.g. for your function

julia> f(3..4)
[-2.66365e+06, 27.2897]

even though the true range of the function is much smaller than that! There are various methods to get around this; see e.g. https://github.com/JuliaReach/RangeEnclosures.jl.

For example, it may be possible to approximate your function using a Taylor series on some range and bound that more easily; see TaylorModels.jl. However, since the function converges to 0 at infinity, that will be problematic; that limiting behaviour is something you cannot prove using interval arithmetic.

dpsanders commented 4 years ago

Another option is to calculate the derivative and show that the derivative is negative over some region, implying that the function is decreasing there.
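
A rough sketch of that check, using ForwardDiff.jl to get an interval enclosure of the derivative (h here is just a stand-in for the actual function):

using IntervalArithmetic, ForwardDiff

h(x) = exp(-x)                         # stand-in for the real function
dX = ForwardDiff.derivative(h, 1..2)   # rigorous enclosure of h'(x) for x in [1, 2]
sup(dX) < 0                            # true here, so h is strictly decreasing on [1, 2]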

newptcai commented 4 years ago

So I've transformed the problem into showing that the following expression is non-negative between e and infinity. It is positive at exp(1) and has limit 0 at infinity.

Can we show that it's positive on something like (2 exp(1), ∞)?

1/2*t^2*(log(2)*log(t^2 - 1)/log(t) - 2*log(-2/(t^2 + 2) + 2))

I tried TaylorModels.jl. It approximates g(t) well on a small interval, but not on large intervals.

dpsanders commented 4 years ago

When I evaluate that function at exp(1) I don't get 0.

But it still seems to have the dependency problem. Basically when a variable is repeated more than once, interval arithmetic starts to give intervals that are too wide.

Kolaru commented 4 years ago

You can always force the use of smaller intervals to try to mitigate the dependency problem. In your original example:

julia> f(x, y) = sinh(x) - 1/2 * (cosh(x)/cosh(x-y))^(x/y)
f (generic function with 1 method)

julia> f(x) = f(x, 0.5*log(2))
f (generic function with 2 methods)

julia> f(1..2)
[-329.129, 3.6074]

But if instead of trying (1 .. 2) directly we use 10000 small steps, we can prove that the function indeed remains positive.

julia> bounds = 1:0.0001:2
1.0:0.0001:2.0

julia> Xs = Interval.(bounds[1:end-1], bounds[2:end]) ;

julia> res = f.(Xs) ;

julia> minimum(inf.(res))
0.19284681924470437

It may take a while, and it may be more efficient to bisect the interval adaptively until the required precision is reached, rather than using a constant step.
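
For instance, a rough sketch of such an adaptive bisection (provepositive is just a made-up helper, and the stopping width is arbitrary):

using IntervalArithmetic

function provepositive(f, X, minwidth = 1e-6)
    inf(f(X)) > 0 && return true          # enclosure entirely positive: done
    diam(X) < minwidth && return false    # piece too small: give up
    X1, X2 = bisect(X)                    # otherwise split X and recurse
    return provepositive(f, X1, minwidth) && provepositive(f, X2, minwidth)
end

provepositive(f, 1..2)   # should return true for the f above (may take a moment)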

However, you won't be able to do this all the way up to infinity, since the function is not well defined there:

julia> f(Inf)
NaN
dpsanders commented 4 years ago

Basically this bisection is what IntervalOptimisation.jl does automatically. The problem is that the sinh and cosh give huge values as the argument x gets larger, whereas the result of the function is very small.
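
The basic usage would be something like this (a sketch; whether the lower bound actually comes out positive depends on the tolerance and on how much the dependency problem inflates the enclosures):

using IntervalArithmetic, IntervalOptimisation

globalmin, minimisers = minimise(f, 1..2)   # interval guaranteed to contain the global minimum of f on [1, 2]
inf(globalmin) > 0                          # if this is true, f is proved positive on [1, 2]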

newptcai commented 4 years ago

I managed to transform the problem into showing that

-(log(2*s + 1) - log(s + 1))*log(s)/log(2) + log(-s + 1)

is non-negative on [0..exp(-2)]. This is at most

-(log(2*s + 1) - log(s + 1))*log(1/2/exp(2))/log(2) + log(-s + 1)

which goes to 0 at 0. The derivative of this is

2*(s^2*log(2) + s*(2*log(2) + 1) - 1)/((2*s + 1)*(s + 1)*(s - 1)*log(2))

We can use IntervalOptimisation.jl to show that this is strictly greater than 0 on [0..exp(-2)].

So it seems that to prove such inequalities mathematically using interval arithmetic, one still has to massage the problem by hand first.

newptcai commented 4 years ago

Actually

f(s) = -(log(2*s + 1) - log(s + 1))*log(1/2/exp(2))/log(2) + log(-s + 1)

is well-defined on -exp(-2)..exp(2), so we can already show that there are only two roots on this domain with IntervalRootFinding:

julia> roots(f, -exp(-2)..exp(2))
2-element Array{Root{Interval{Float64}},1}:
 Root([0.749847, 0.749848], :unique)
 Root([-1.44908e-09, 1.38232e-09], :unique)

We know analytically that f(0)=0. Then using IntervalArithmetic.jl

julia> (@interval f(exp(-2))) > 0
true

Therefore, since the only root of f in [0, exp(-2)] is the one at 0 and f(exp(-2)) > 0, we can conclude by continuity that f(x) >= 0 on [0, exp(-2)].

dpsanders commented 4 years ago

Yes, you will often need to "massage" your problem first; it's unfortunate. There is some ongoing work on symbolic manipulation which might help, but it will still need to be driven by a human.

Note that you can use ForwardDiff.jl to calculate numerically-exact bounds on the derivative of a function (with the usual caveats about the dependency problem when you are using intervals):

julia> f(s) = -(log(2*s + 1) - log(s + 1))*log(1/2/exp(2))/log(2) + log(-s + 1)
f (generic function with 1 method)

julia> ForwardDiff.derivative(f, 0.1..0.2)
[0.768384, 2.12672]
newptcai commented 4 years ago

I have written this up -- https://newptcai.github.io/proving-an-inequality-with-julia-and-interval-arithmetic.html

Hope it will be helpful for others. :)

newptcai commented 4 years ago

BTW, this is probably a bug in IntervalArithmetic:

julia> a = @biginterval log(2)/2
[0.346573, 0.346574]₂₅₆

julia> f(x) = sinh(x) - 1/2 * (cosh(x)/cosh(x-a))^(x/a)
f (generic function with 1 method)

julia> f(50)
[3.32447e+07, 3.32448e+07]₂₅₆

I have tried Mathematica, Maple, and SageMath. All suggest that f(50) ≈ 1.381*10^-20.

Kolaru commented 4 years ago
julia> f(@biginterval(50), @biginterval(a))
[1.38165e-20, 1.38166e-20]₂₅₆

julia> f(big(50), @biginterval(a))
[1.38165e-20, 1.38166e-20]₂₅₆

cosh(x) and sinh(x) need to be computed with BigFloat precision. IntervalArithmetic has no way of knowing that these values are the result of a floating-point computation and are therefore not exact.

In general, IntervalArithmetic results are only guaranteed if all floating-point numbers involved have been replaced by intervals.
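
As a rough illustration of the scale of the problem (just a sketch; the exact error varies): cosh(50) is about 2.6e21, so even a correctly rounded Float64 value can be off by something of order 1e5 in absolute terms, which completely swamps the ~1e-20 quantity f is supposed to return.

using IntervalArithmetic

setprecision(256)
x_float = cosh(50)                 # plain Float64 computation, rounded
X_big   = cosh(@biginterval(50))   # rigorous 256-bit enclosure of the true value
abs(x_float - mid(X_big))          # absolute error of the Float64 value; one ulp at this magnitude is ~5e5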

newptcai commented 4 years ago

In my code above I used the integer x = 50 and a was a big interval. So couldn't 50 be turned into a big interval automatically? That is what I was expecting.

Kolaru commented 4 years ago

It will be turned into a big interval when it first interacts with an interval. But sinh(x) and cosh(x) do not have any interval involved, so they are computed by Julia's default mechanism, which converts the Int to Float64. x - a, on the other hand, immediately turns x into a big interval.
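
Concretely, a small sketch of that promotion order:

using IntervalArithmetic

a = @biginterval log(2)/2
x = 50                   # an Int

typeof(sinh(x))          # Float64: no interval is involved in this call
typeof(x - a)            # Interval{BigFloat}: here x meets an interval and gets promoted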

dpsanders commented 4 years ago

That's a great blog post!

The way Julia works, it just does what you tell it. You start off with an integer x and the first thing it does is compute sinh(x) for that integer x, by converting x to a Float64.

Only later do you ask it to subtract a from x, at which point it will convert x to an Interval.

newptcai commented 4 years ago

> That's a great blog post!
>
> The way Julia works, it just does what you tell it. You start off with an integer x and the first thing it does is compute sinh(x) for that integer x, by converting x to a Float64.
>
> Only later do you ask it to subtract a from x, at which point it will convert x to an Interval.

I see. Thanks.

newptcai commented 4 years ago
> julia> f(@biginterval(50), @biginterval(a))
> [1.38165e-20, 1.38166e-20]₂₅₆
>
> julia> f(big(50), @biginterval(a))
> [1.38165e-20, 1.38166e-20]₂₅₆
>
> cosh(x) and sinh(x) need to be computed with BigFloat precision. IntervalArithmetic has no way of knowing that these values are the result of a floating-point computation and are therefore not exact.
>
> In general, IntervalArithmetic results are only guaranteed if all floating-point numbers involved have been replaced by intervals.

We must be using different versions. I got this

julia> using IntervalArithmetic

julia> a = @biginterval log(2)/2
[0.346573, 0.346574]₂₅₆

julia> f(x, y) = sinh(x) - 1/2 * (cosh(x)/cosh(x-y))^(x/y)
f (generic function with 1 method)

julia> f(@biginterval(50), @biginterval(a))
[7.95003e-14, 7.95004e-14]₂₅₆

julia> f(@biginterval(50), a)
[7.95003e-14, 7.95004e-14]₂₅₆

julia> f(big(50), a)
[7.95003e-14, 7.95004e-14]₂₅₆
dpsanders commented 4 years ago

That's worrying. Which version and machine are you on?

newptcai commented 4 years ago

Status ~/.julia/environments/v1.4/Project.toml
  [6e4b80f9] BenchmarkTools v0.5.0
  [59287772] Formatting v0.4.1
  [f6369f11] ForwardDiff v0.10.10
  [28b8d3ca] GR v0.48.0
  [7073ff75] IJulia v1.21.1
  [d1acc4aa] IntervalArithmetic v0.16.1
  [138f1668] IntervalConstraintProgramming v0.12.0
  [c7c68f13] IntervalOptimisation v0.4.1
  [d2bf35a9] IntervalRootFinding v0.5.2
  [47be7bcc] ORCA v0.3.1
  [5fb14364] OhMyREPL v0.5.5
  [f0f68f2c] PlotlyJS v0.13.1
  [91a5bcdd] Plots v1.0.8
  [d330b81b] PyPlot v2.9.0
  [4a1f3257] RandomTree v0.1.0 [~/code/RandomTree]
  [295af30f] Revise v2.6.0
  [90137ffa] StaticArrays v0.12.1
  [123dc426] SymEngine v0.7.1
  [24249f21] SymPy v1.0.19
  [314ce334] TaylorModels v0.3.0
  [d621b6e3] ValidatedNumerics v0.11.0

shell> lsb_release --all
LSB Version:    core-9.20170808ubuntu1-noarch:printing-9.20170808ubuntu1-noarch:security-9.20170808ubuntu1-noarch
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:        18.04
Codename:       bionic

dpsanders commented 4 years ago

What happens if you update to the latest version of IntervalArithmetic?

I don't see why it should make a difference, but...

dpsanders commented 4 years ago

By the way, in Julia you can do

versioninfo()

to get info about your machine

dpsanders commented 4 years ago

Also you can test this without needing to update all packages by making a new directory and doing

]activate .

there, then adding IntervalArithmetic
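
i.e. something like this (a sketch; the directory name ia-test is made up):

julia> mkdir("ia-test"); cd("ia-test")

(@v1.4) pkg> activate .

(ia-test) pkg> add IntervalArithmetic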

dpsanders commented 4 years ago

OK weird, I see the same results as @newptcai when I make a new environment and load the released version of IntervalArithmetic

newptcai commented 4 years ago

> OK weird, I see the same results as @newptcai when I make a new environment and load the released version of IntervalArithmetic

Yeah, I think I have the latest version.

dpsanders commented 4 years ago

What happens if you use the branch nonrecursive_powers?

]add IntervalArithmetic#nonrecursive_powers

dpsanders commented 4 years ago

Maybe there's a problem with the definition of ^?

newptcai commented 4 years ago

> Maybe there's a problem with the definition of ^?

Yes. This branch is correct now.

dpsanders commented 4 years ago

OK thanks, will look into this.