dlfivefifty / NumericalFreeProbability.jl

A Julia package for computing free convolutions
MIT License

Inverting Cauchy Transform for any measure #3

Open jamiecjx opened 1 year ago

jamiecjx commented 1 year ago

I have tried the inversion algorithm we discussed yesterday, and the idea does work overall: bar one problem, I can invert a square-root measure.

There's one main problem I'm having at the moment

When we impose the extra condition $\begin{bmatrix} 1 & 0 & \dots & 0 \end{bmatrix} v = q_0(z)$, this condition depends upon $z$. Once we know what this value is supposed to be, it is easy to find the inverses. Without knowing $q_0(z)$, we can only determine the solution up to a set of "eigenvectors", much as in a regular eigenvalue problem, where the eigenvectors are only determined up to subspaces.

In the inhomogeneous case, the solutions lie on hyperbolic curves, and I'm not sure how to deal with this yet.

Do we have a meeting tomorrow at 10:30am to discuss this?

dlfivefifty commented 1 year ago

Computing q_0(z) is easy away from the support once we know the 3-term recurrence: if we denote the Jacobi matrix by X, note that we have

(X - z*I) * [q_0(z),q_1(z),…] = e_0

A quick-and-dirty way is then to truncate and invert this system.

More generally, you can break it into its components, as the measures we consider should be expressible in terms of known things... note all the Jacobi weights should be available here:

https://github.com/JuliaApproximation/SingularIntegrals.jl
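
For illustration, a minimal sketch of the quick-and-dirty truncation for Chebyshev U (the truncation size N is an assumption, and the sign convention below takes $G(z) = \int p(x)/(z-x)\,dx$; the decay of $q_n(z)$ away from the support is what makes ignoring the tail reasonable):

using LinearAlgebra, ClassicalOrthogonalPolynomials

# Truncate the semi-infinite system (z*I - X) * [q_0(z), q_1(z), …] = e_0
# at size N and solve it as a finite linear system.
function cauchy_ops(z, N)
    X = Matrix(jacobimatrix(ChebyshevU())[1:N, 1:N])
    e0 = zeros(ComplexF64, N); e0[1] = 1
    (z*I - X) \ e0
end

cauchy_ops(1.3 + 0.5im, 50)[1]   # ≈ q_0(1.3 + 0.5im) = 2(z - √(z-1)√(z+1))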

jamiecjx commented 1 year ago

The thing is, this still depends upon knowing what $z$ is

In normal eigenvalue problems, $z$ can only be one of finitely many numbers, since they're just eigenvalues; but in this inhomogeneous case, $z$ can take on a continuum of values.

dlfivefifty commented 1 year ago

Can we just find the roots of $Cf(z)/ q_0(z)$?

dlfivefifty commented 1 year ago

Oh wait, that would mess up the RHS of the recurrence....

dlfivefifty commented 1 year ago

I think there's a super simple solution: $q_n(z) \to 0$ exponentially fast, so for all practical purposes we can replace the functional constraint with $e_n^\top v = 0$. Can you try that?

jamiecjx commented 1 year ago

Well, it seems that works (at least for the one square root measure I have right now)

With the current test case I have, the inverse is accurate to roughly $10^{-6}$ relative error

I'll try it with some other cases now

dlfivefifty commented 1 year ago

Cool! The accuracy isn't great, but maybe we need to impose the condition at a higher n (this would mean the rank-1 perturbation is not in the bottom row)…

jamiecjx commented 1 year ago

I tried the algorithm with a single Newton iteration afterwards (evaluating the Cauchy transform and its derivatives using quadrature), and it seems to perform much better: it reaches machine precision (at least in the cases I tried).

Edit: more iterations are required when $G^{-1}$ is close to the support. Additionally, if $G^{-1}$ is extremely close to the support, I think $q_n$ doesn't decay to 0 fast enough for the imposed condition to be valid, and in these cases I cannot get an inverse out.

dlfivefifty commented 1 year ago

Yes, that's a good approach too. The routine for computing Cauchy transforms should work with dual numbers (ForwardDiff.derivative), so there's no need for quadrature…

There's a minor concern about coalescing roots. This would only happen near branch points, so it might not come up in our application.

The other issue is roots near the support of the measure
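
As a concrete sanity check of the Newton refinement, here is a minimal sketch on the semicircle transform, where $G$, its derivative, and the true inverse $G^{-1}(w) = w + 1/w$ all have closed forms (the starting guess $z_0 = 1/w$, from the asymptotics $G(z) \approx 1/z$, is my assumption; near the support more iterations or a better start may be needed):

# Cauchy transform of the semicircle on [-2,2] and its analytic derivative
Gsc(z) = (z - sqrt(z - 2) * sqrt(z + 2)) / 2
dGsc(z) = (1 - z / (sqrt(z - 2) * sqrt(z + 2))) / 2

function newton_invert(G, dG, w, z0; tol=1e-14, maxits=40)
    z = z0
    for _ in 1:maxits
        δ = (G(z) - w) / dG(z)   # Newton step for G(z) = w
        z -= δ
        abs(δ) < tol && break
    end
    z
end

w = Gsc(0.3 + 0.4im)
newton_invert(Gsc, dGsc, w, inv(w))   # ≈ 0.3 + 0.4im, i.e. w + 1/w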

jamiecjx commented 1 year ago

Sorry that I wrote this at an ungodly hour, but I have successfully inverted the following measures using the combination of the comrade/colleague matrix + Newton iterations at random points in the complex plane:

$K\sqrt{1-x^2}\cos(x)\,dx$
$K\sqrt{1-x^2}(1 + \frac{x}{2} + \frac{x^3}{6})\,dx$
$K(1-x)^2(x+1)^2 \exp(x)\,dx$

These were all done to machine precision, except when $z = G^{-1}(w)$ lies very close to the support, in which case the algorithm fails because $q_n(z)$ is not close to 0.

Currently, I have these things to think about:

jamiecjx commented 1 year ago

I reckon a way to work around the problem of finding $G^{-1}(w)$ when it lies close to the support is:

Consider a finite sequence $w_n$ converging to $w$, whose early terms lie away from the support; then iteratively solve for $q_n(G^{-1}(w_k))$ to get better approximations for $q_n(z)$.

(i.e. just some stepping to help the algorithm)

Under this scheme, I can calculate the inverses at pretty much any point, including close to the support (as long as the expansion of the density in OPs has a few non-zero terms).

Choosing such a sequence is somewhat arbitrary at the moment, but it seems you generally need more points in the sequence when the inverse you seek is close to the support, or when the expansion of the density in OPs has few terms.
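
A minimal sketch of this stepping idea on the semicircle test transform (the linear path of targets and the step count are my assumptions, not the package's method):

# Semicircle transform and derivative (closed forms, as in the Newton sketch above)
Gsc(z) = (z - sqrt(z - 2) * sqrt(z + 2)) / 2
dGsc(z) = (1 - z / (sqrt(z - 2) * sqrt(z + 2))) / 2

# Walk targets w_1, …, w_K = w from an easy point towards w, reusing each
# inverse as the next Newton starting guess.
function invert_by_continuation(G, dG, w; steps=8)
    z = inv(w)        # initial guess from G(z) ≈ 1/z
    w1 = G(z)         # a target we can already invert comfortably
    for k in 1:steps
        wk = (1 - k/steps) * w1 + (k/steps) * w
        for _ in 1:20               # Newton refinement at waypoint wk
            z -= (G(z) - wk) / dG(z)
        end
    end
    z
end

invert_by_continuation(Gsc, dGsc, Gsc(0.05 + 0.02im))   # inverse lies close to the support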

Calculating the inverse when there's only one term (or too few terms) in the OP expansion is not really doable with the comrade matrix as far as I can see; in this case it's better to just use the Joukowski transform and FFT + polynomial inversion.

dlfivefifty commented 1 year ago

We might be able to think of the original system as a “nonlinear eigenvalues problem” for which there are a number of algorithms. Here’s a recent paper that might give some ideas:

https://arxiv.org/pdf/2305.01691.pdf


jamiecjx commented 1 year ago

It is possible to reformulate the problem as an NEP of the form $M(z)Q = 0$, where

[image: definition of $M(z)$]

and solutions to this NEP are exactly the solutions to $zQ = AQ + b$, with $\Sigma Q = q_0(z)$.

It turns out you can use Beyn's contour-integral method (mentioned in the paper you shared above) to solve this NEP. I am trying other methods out, but some algorithms require evaluating $q_0(z)$ for a matrix-valued input, which seems problematic.

The implementation of Beyn's algorithm in Julia is found in NonlinearEigenproblems.jl. It requires a contour as input and can locate inverses that lie in its interior. In NonlinearEigenproblems.jl, the implemented contours are just ellipses.

It seems to be pretty computationally expensive when the contour is enlarged.

dlfivefifty commented 1 year ago

> evaluating $q_0(z)$ for a matrix-valued input, which seems problematic

For square-root measures or Legendre (including on multiple intervals) this is no problem, as it just involves square roots or logarithms of matrices (in Julia, sqrt(::Matrix) and log(::Matrix) compute the matrix square root and logarithm).

In general, if a matrix is diagonalisable one can just diagonalise anyway.
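
For instance, the scalar $q_0$ for the Chebyshev-U weight used later in this thread extends directly to matrix arguments (a sketch; branch behaviour near the support is not addressed):

using LinearAlgebra

# Mirrors the scalar formula q_0(z) = 2(z - √(z-1)√(z+1));
# sqrt(::Matrix) computes the principal matrix square root.
q_0(Z::AbstractMatrix) = 2 * (Z - sqrt(Z - I) * sqrt(Z + I))

Z = [2.0 + 1.0im  0.1; 0.0  3.0 + 1.0im]
q_0(Z)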

jamiecjx commented 1 year ago

I had another idea.

Given the NEP from before, which has an eigenvalue $z$ iff $z$ is an inverse of the Cauchy transform, we consider a transformed NEP. Previously we were looking WLOG for roots in the upper half plane (conjugating if in the lower half plane).

Conformally map the upper half plane to the unit disc, so that we are now looking for transformed roots in the unit disk.
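
The specific map isn't written down in the thread; a standard choice is the Cayley transform:

cayley(z) = (z - im) / (z + im)          # upper half plane → unit disc
inv_cayley(w) = im * (1 + w) / (1 - w)   # unit disc → upper half plane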

Here, we can use Beyn's contour method quite well: just integrate over a circular contour of radius $r < 1$ centred at 0. The larger $r$ is, the greater the area in which roots can be found, but more quadrature points are needed. In particular, we are theoretically guaranteed to find ALL inverses as $r \to 1$.

In practice, it is still very difficult to invert points whose inverse happens to lie near the support. However I can get closer to the support by taking additional quadrature points.

For example, using the measure $K\sqrt{1-x^2}(1 + \frac{x}{2} + \frac{x^3}{6})dx$, I can invert points such as $w = G(0.1+0.01im)$ with no problem.

EDIT: I forgot, the Joukowski map probably works better.

EDIT 2: The Joukowski map is awkward because it maps infinity to the interior of the unit disk, and there seem to be some problems with this: the matrix $M(\lambda)$ is almost rank deficient when $\lambda$ is very large, yet there is no eigenvalue at infinity. An alternative approach might be to map the right half plane $\mathrm{Re}(z) > 1$ to the unit disk to cover the slit plane $\mathbb{C} \setminus [-1,1]$.

dlfivefifty commented 1 year ago

I was thinking that a rectangular contour might help if we know exactly the form of the singularity, but perhaps I'm missing something…

jamiecjx commented 1 year ago

When transforming the domain using the Joukowski map, the new NEP $M(J(z))$ has a singularity at 0, where some entries of $M(J(z))$ blow up. However, it appears the determinant of $M(J(z))$ has a limiting value there, so maybe the singularity could be removed? It seems to be the only thing stopping the use of the Joukowski map, which would guarantee recovery of all inverses.

[image: logarithmic absolute value of the determinant of $M(J(z))$; the black spot corresponds to the inverse, and the white spot is where the singularity happens]

jamiecjx commented 1 year ago
using LinearAlgebra, ClassicalOrthogonalPolynomials, SingularIntegrals
using Random

P = ChebyshevU()
p = x -> (x^3/6 + x/2 + 1) * 2/pi * √(1-x^2)   # density of the measure
f = x -> x^3/6 + x/2 + 1                        # polynomial factor of the density
W = Weighted(P); p_expanded = expand(W, p);
x = axes(W, 1)
J = jacobimatrix(P);
f_k = expand(P, f)

# Cauchy transform G(z) = ∫ p(x)/(z - x) dx, evaluated via SingularIntegrals
function G(z)
    inv.(z .- x') * p_expanded
end

# Cauchy transform q_n(z) of the n-th weighted Chebyshev-U polynomial
function q_n(z, n)
    P = ChebyshevU()
    f = x -> P[x,n+1] * sqrt(1-x^2) * 2/pi
    W = Weighted(P); p_expanded = expand(W, f);
    inv.(z .- x') * p_expanded
end

n = 2
z = 1.3 + 0.5im   # test point

w = G(z)          # target value to invert

# Closed form q_0(z) = 2(z - √(z-1)√(z+1)) for the Chebyshev-U weight 2/π √(1-x²);
# written with `I` so that matrix arguments also work
function q_0(z)
    2 * (z - sqrt(z-I) * sqrt(z+I))
end

# Joukowski map; this definition works for both scalars and square matrices
joukowski(x) = (x + inv(x)) / 2

# Build the comrade-matrix system: solutions of z*Q = A*Q + b with Σ'Q = q_0(z)
# correspond to inverses of the Cauchy transform at w
function inversecauchytransformmatrices(w, J, f, n)
    f_k = f.args[2][1:n+1]                    # first n+1 expansion coefficients
    last = f.args[2][n+2]
    A = Matrix(J[1:n+1,1:n+1])
    A[end,:] -= f_k ./ last .* J[n+1, n+2]    # rank-1 perturbation of the bottom row
    b = zeros(ComplexF64, n+1)
    b[end] = w/last * J[n+1, n+2]
    b[1] = 1
    Σ = zeros(n+1)
    Σ[1] += 1
    A, b, Σ
end

A, b, Σ = inversecauchytransformmatrices(w, J', f_k, n)

Q = [q_n(z, i) for i=0:n]   # reference values q_0(z), …, q_n(z)

# Coefficient matrices of the NEP M(λ) = A1*f1(λ) + A2*f2(λ) + A3*f3(λ) + A4*f4(λ)
A1 = [
    0 Σ'
    b A
]

A2 = Diagonal([-(i != 0) for i=0:n+1])

A3 = [
    0 zeros(n+1)'
    A*Σ zeros(n+1, n+1)
]

A4 = [
    0 zeros(n+1)'
    -Σ zeros(n+1, n+1)
]

Jinv_p = z -> z - √(z - 1) * √(z + 1)   # inverse Joukowski branch, mapping C \ [-1,1] into the unit disc

f1 = z -> one(z)                            # scalar coefficient functions of the NEP;
f2 = z -> joukowski(z)                      # λ lives in the unit disc,
f3 = z -> q_0(joukowski(z))                 # joukowski(λ) is back in the z-plane
f4 = z -> joukowski(z) * q_0(joukowski(z))

AA = [A1, A2, A3, A4]
fii = [f1,f2,f3,f4]

T = λ -> (A1 * f1(λ) + A2 * f2(λ) + A3 * f3(λ) + A4 * f4(λ))

display(det(T(Jinv_p(1.3+0.5im))))   # sanity check: ≈ 0 at the known inverse

# Actual Beyn's contour method

m = 4                # size of the NEP
Random.seed!(10)
l = 4                # number of random probe columns
V̂ = [randn() + randn()*im for i=1:m, j=1:l]

r = 0.4              # contour radius
N = 1000             # trapezoid-rule points on the contour
k = 3                # number of singular values kept
μ = 0.5              # contour centre

# r = 0.9
# N = 1000
# k = 3
# μ = 0

ϕ = t -> μ + r * exp(im * t)     # circular contour and its derivative
dϕ = t -> im * r * exp(im * t)

# Beyn moment matrices A_p ≈ (1/2πi) ∮ λ^p T(λ)⁻¹ V̂ dλ, p = 0, 1 (trapezoid rule)
A_0N = r/N * sum(inv(T(ϕ(2π*j/N))) * V̂ * exp(2pi*im * j / N) for j=0:N-1)
A_1N = μ * A_0N + r^2/N * sum(inv(T(ϕ(2π*j/N))) * V̂ * exp(4pi*im * j / N) for j=0:N-1)

V, S, W = svd(A_0N)              # rank-revealing SVD of the zeroth moment

V_0 = V[1:m, 1:k]
W_0 = W[1:l, 1:k]
S_0 = Diagonal(S[1:k])

B = V_0' * A_1N * W_0 * S_0^-1   # eigenvalues of B are the NEP eigenvalues inside the contour

joukowski.(eigvals(B))           # map back from the unit disc to the z-plane

This is a sample of what's going on with the Beyn contour-integral algorithm. I can recover the inverses when the contour does not enclose the centre of the unit disk (so for a circle with r = 0.4 and μ = 0.5), but in the case r = 0.9, μ = 0, the algorithm fails.

dlfivefifty commented 1 year ago

Does treating it as an NEP actually help us versus just treating it as a root finding problem?

jamiecjx commented 1 year ago

From what I've tested:

Treating it as an NEP theoretically allows all inverses to be recovered with contour integrals, but this is very computationally expensive, and since we'd need to take inverses of a point cloud, it may only be worthwhile when the inverses happen to lie near the support. There's a branch that contains the code I used to do this.

@time inversecauchytransform(1.009im, P, w, f, n, radius = 0.99, N=4000)
 1.601156 seconds (1.91 M allocations: 127.397 MiB, 2.36% gc time)
2-element Vector{Any}:
 -9.483694602322844e-13 - 0.4807201206241631im
  3.248226654864378e-13 - 0.009299480162999259im

Here, with enough points and $r < 1$ close enough to 1, I can find points lying very close to the support, at the expense of taking far too long. Currently the implementation in the branch applies Beyn's method 4 times, which is definitely not optimal.

Solving the inhomogeneous eigenvalue problem as we had before (setting $q_n(z) = 0$) doesn't allow us to take the inverse when it lies close to the support (since then $q_n(z)$ is not close to 0), but it is far faster computationally. Accuracy doesn't seem to be an issue when coupled with Newton iterations.

z=0.1*im
w = G(z)
@time inversecauchytransform(w, J, f_k, G, dG, 1; q=0, tol=10^-15)
  0.001707 seconds (556 allocations: 33.156 KiB)

1-element Vector{Any}:
 0.0 + 0.32761352419641243im

Here I completely miss the inverse $z = 0.1i$, though $0.3276i$ is also an inverse. Tweaking the parameter q=0 to equal $q_n(0.1i)$ allows recovery of this inverse, but again $q_n(z)$ isn't known...

jamiecjx commented 1 year ago

Also, I went through the 2012 preprint again and I'm still not entirely convinced by Theorem 2.2.

I think I sent this paper before: https://arxiv.org/pdf/1302.4532.pdf, which deals with the convolution of a semicircle with a Jacobi measure. Specifically, Lemma 2.7 says that if you consider $\mu_{sc} \boxplus \lambda \mu$, where $\mu$ is a Jacobi measure $\frac{1}{Z} (1+x)^a (1-x)^b \,\mathrm{d}x$ on $[-1,1]$, then there exist critical parameters $\lambda_1$ and $\lambda_2$ such that for $|\lambda| > \lambda_{1,2}$ there is no square-root behaviour on the left/right side of the support of the convolution.

Additionally, their Lemma A.4 gives a formula for the critical parameter in terms of an integral.

Consider the convolution of a standard semicircle $\mu_{sc}$ on $[-2,2]$ with the Jacobi measure $\mu = \frac{15}{16} (1-x)^2 (1+x)^2 \,\mathrm{d}x$. Then the critical parameters are $\lambda_1 = \lambda_2 = \sqrt{5/2}$, so $\mu_{sc} \boxplus 3\mu$ does not have square-root singularities.

Also, the Cauchy transform of $\mu$ is univalent on the whole slit complex plane, which comes from https://arxiv.org/pdf/math/0306130.pdf

[image]

This is what the images of the support of each measure look like ($G_{\mu_{sc}}$, $G_{3\mu}$, $G_{\mu_{sc} \boxplus 3\mu}$ respectively); the part I'm not sure about is the real-valued curves touching the real axis tangentially, which doesn't happen when $\lambda$ is not above the critical value.

[image]

Same image as before, but with $\lambda = 1.5$, below the critical value.

dlfivefifty commented 1 year ago

Looks very convincing. Probably there’s an extra condition missing on the parameters allowed in the Jacobi measure?

jamiecjx commented 1 year ago

In the arXiv 1302 paper they state that the Jacobi parameters $a$ and $b$ need to be $\leq 1$ in order for square-root behaviour to always hold (Lemma 2.7); similar conditions were enforced in https://arxiv.org/abs/1804.11199

jamiecjx commented 1 year ago

9d406e38a75b2cb1e3e93f4b121336d329586887

After some headbanging against a wall (type inference...), I linked a commit on a different branch that has the NEP contour method for inverting a Cauchy transform. I wrote my own Beyn contour method to better suit my purposes.

It is more accurate than the general FFT/companion-matrix-based method and takes about the same time to execute. In particular, the FFT-based method cannot achieve that level of accuracy (afaik).

julia> sqm = SquarerootMeasure(x -> x^3/6 + x/2 + 1); n=2; z=-0.5 + 0.1im; y = sqm.G(z);
julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
  4.632 ms (22952 allocations: 8.01 MiB)
1-element Vector{Any}:
 -0.4999999999999883 + 0.10000000000000556im

julia> jm = JacobiMeasure(x -> (7x^2 + 1)/2, 2, 2); n=1; z=-0.5 + 0.1im; y = jm.G(z);
julia> @btime inversecauchytransform(y, jm, n; radius= 0.9, N = 1000)
  6.224 ms (42748 allocations: 6.65 MiB)
1-element Vector{Any}:
 -0.5000000000000075 + 0.09999999999999251im

julia> sqm = SquarerootMeasure(x -> (4x^2+1)/2); n=1; z=0.1im; y = sqm.G(z);
julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
  4.174 ms (22968 allocations: 5.78 MiB)
2-element Vector{Any}:
 -6.145050687937699e-15 + 0.10000000000001238im
 1.0810235311851757e-14 + 0.3276135241963962im

Comparison with the FFT-companion method using FFTW (for the 2nd example):

P = Jacobi(2,2)
p = x -> 15/16 * (x-1)^2 * (x+1)^2 * (7x^2 + 1)/2
f = x -> (7x^2 + 1)/2
W = Weighted(P); p_expanded = expand(W, p);
x = axes(W, 1)

function G(z)
    inv.(z .- x') * p_expanded
end

julia> y = G(-0.5 + 0.1im)
julia> @btime testinvcauchytransform(y; maxterms=100, tol=1.2, r=0.9, n=127)
    4.086 ms (16717 allocations: 1.29 MiB)
1-element Vector{ComplexF64}:
 -0.49999999997643735 + 0.09999999992963382im

dlfivefifty commented 1 year ago

The Vector{Any}s are a big red flag: it means it's nowhere near as fast as it should be.

You can use @code_warntype to see where type inference goes wrong...
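
For instance, on the call from the benchmarks above:

julia> @code_warntype inversecauchytransform(y, sqm, n)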

jamiecjx commented 1 year ago

After fixing the type inference and using StaticArrays.jl (because we're inverting 1000 4×4 matrices)... I also checked that the vast majority of the time/allocations now come from evaluating $q_0(z)$ 1000 times.

julia> sqm = SquarerootMeasure(x -> x^3/6 + x/2 + 1); n=2; z=-0.5 + 0.1im; y = sqm.G(z);

julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
  2.516 ms (15052 allocations: 2.62 MiB)
1-element Vector{ComplexF64}:
 -0.4999999999999878 + 0.10000000000000489im

julia> jm = JacobiMeasure(x -> (7x^2 + 1)/2, 2, 2); n=1; z=-0.5 + 0.1im; y = jm.G(z);

julia> @btime inversecauchytransform(y, jm, n; radius= 0.9, N = 1000)
  4.292 ms (34841 allocations: 2.60 MiB)
1-element Vector{ComplexF64}:
 -0.5000000000000027 + 0.09999999999999179im

julia> sqm = SquarerootMeasure(x -> (4x^2+1)/2); n=1; z=0.1im; y = sqm.G(z);

julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
  2.280 ms (15052 allocations: 1.73 MiB)
2-element Vector{ComplexF64}:
 -1.0658800231322508e-14 + 0.10000000000000689im
  1.7609577994613308e-14 + 0.32761352419640066im
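
A sketch of the kind of change this refers to (`SMatrix` from StaticArrays.jl keeps small fixed-size linear algebra off the heap; the 4×4 size matches the matrices above):

using StaticArrays

M = SMatrix{4,4}(randn(ComplexF64, 4, 4))   # fixed-size, stack-allocated matrix
inv(M)                                      # specialised small-matrix inverse, no heap allocation
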
jamiecjx commented 1 year ago

Oh damn. After using the SingularIntegrals functionality for evaluating at a vector of points:

julia> sqm = SquarerootMeasure(x -> x^3/6 + x/2 + 1); n=2; z=-0.5 + 0.1im; y = sqm.G(z);

julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
  866.800 μs (423 allocations: 961.69 KiB)
1-element Vector{ComplexF64}:
 -0.49999999999998257 + 0.09999999999999465im

julia> jm = JacobiMeasure(x -> (7x^2 + 1)/2, 2, 2); n=1; z=-0.5 + 0.1im; y = jm.G(z);

julia> @btime inversecauchytransform(y, jm, n; radius= 0.9, N = 1000)
  2.258 ms (4351 allocations: 711.64 KiB)
1-element Vector{ComplexF64}:
 -0.5000000000000047 + 0.0999999999999914im

julia> sqm = SquarerootMeasure(x -> (4x^2+1)/2); n=1; z=0.1im; y = sqm.G(z);

julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
  744.300 μs (423 allocations: 590.25 KiB)
2-element Vector{ComplexF64}:
  3.8426583692352975e-15 + 0.10000000000001455im
 -4.4751317772311766e-15 + 0.32761352419638456im
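
The batched evaluation follows the same quasi-array pattern as `G` in the script above, applied to a whole vector of points at once (a sketch, assuming `x` and `p_expanded` as defined there):

# evaluate the Cauchy transform at many points in one call
zs = [0.5 + 0.1im, 1.3 + 0.5im, -0.2 + 0.9im]
inv.(zs .- x') * p_expanded   # vector of G values, one per entry of zs
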
jamiecjx commented 1 year ago

One thing I'm thinking about: whenever we expand a measure's density in orthogonal polynomials and truncate the series, can we prove that, as we take more and more terms in the expansion, the solutions of the system $zQ = AQ + b$ (where $A$ is the comrade matrix) converge to the actual inverses?

I'm wondering if there's some standard technique to deal with this sort of thing

dlfivefifty commented 1 year ago

This is an interesting question. For standard companion matrices we have results: if a sequence of analytic functions converges uniformly on the boundary of a set, then asymptotically they have the same number of zeros, and the zeros converge to the true zeros (I think this follows from https://en.wikipedia.org/wiki/Rouché's_theorem)

Is that enough?

jamiecjx commented 1 year ago

Yes, it does work (at least if the support of the measure is compact); it also goes by the name of Hurwitz's theorem.
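
For reference, the statement being invoked (standard Hurwitz, my paraphrase): if $f_k \to f$ locally uniformly on a domain $\Omega$ with $f \not\equiv 0$, and $D$ is a disc with $\overline{D} \subset \Omega$ and $f \neq 0$ on $\partial D$, then for all sufficiently large $k$ the functions $f_k$ and $f$ have the same number of zeros in $D$, and the zeros of $f_k$ converge to those of $f$.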

Additionally, I thought that it might be possible to speed up inverting point clouds by using rank-1 updates instead of re-inverting the matrix in the contour integral for each additional point (Sherman-Morrison formula).
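
A sketch of the update (generic: `u` and `v` stand for whatever rank-1 change the new point induces in the matrix being inverted):

using LinearAlgebra

# Sherman-Morrison: (A + u*v')⁻¹ = A⁻¹ - A⁻¹*u*v'*A⁻¹ / (1 + v'*A⁻¹*u)
function sherman_morrison(Ainv, u, v)
    Au = Ainv * u
    Ainv - Au * (v' * Ainv) / (1 + v' * Au)
end

A = randn(4, 4); u = randn(4); v = randn(4)
sherman_morrison(inv(A), u, v) ≈ inv(A + u * v')   # true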

dlfivefifty commented 1 year ago

> using rank-1 updates instead of re-inverting the matrix in the contour integral for each additional point

I really like this idea!

jamiecjx commented 1 year ago

All the matrices that need to be inverted are of "comrade" form: tridiagonal with a rank-1 perturbation in the bottom row.

There has to be some optimised routine for computing inverses of matrices like these? I didn't find another package with a special type for them.

dlfivefifty commented 1 year ago

Probably using sparse is good enough or you can use AlmostBandedMatrix in SemiseparableMatrices.jl

jamiecjx commented 1 year ago

I tried using AlmostBandedMatrix, but most of the matrices are small, at most 20×20. I find that solving Ax = b with AlmostBandedMatrix only becomes faster when n = 100 or more.

dlfivefifty commented 1 year ago

Oh yeah for matrices that small it'll be hard to beat dense linear algebra. Though I guess banded and tridiagonal do:

julia> n = 20; A = randn(n,n); x = randn(n); @btime A\x;
  3.774 μs (3 allocations: 3.69 KiB)

julia> n = 20; A = Tridiagonal(randn(n-1),randn(n),randn(n-1)); x = randn(n); @btime A\x;
  584.028 ns (8 allocations: 1.36 KiB)

julia> n = 20; A = BandedMatrix(-1 => randn(n-1), 0 => randn(n),1 => randn(n-1)); x = randn(n); @btime A\x;
  1.926 μs (8 allocations: 1.70 KiB)

Using the Woodbury formula might be faster than AlmostBandedMatrix (but the latter uses QR so is more stable).
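
A sketch of that trade-off for the comrade structure (tridiagonal plus a rank-1 bottom-row term): two tridiagonal solves replace one dense solve via Woodbury/Sherman-Morrison.

using LinearAlgebra

n = 20
T = Tridiagonal(randn(n-1), randn(n), randn(n-1))
u = zeros(n); u[end] = 1     # rank-1 perturbation confined to the bottom row
v = randn(n)
b = randn(n)

# solve (T + u*v') x = b with two tridiagonal solves
y = T \ b
s = T \ u
x = y - s * ((v' * y) / (1 + v' * s))

x ≈ (Matrix(T) + u * v') \ b   # true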