jamiecjx opened this issue 1 year ago
Computing $q_0(z)$ is easy away from the support once we know the three-term recurrence: if we denote the Jacobi matrix by $X$, note that we have
$$(X - zI)\,[q_0(z), q_1(z), \dots]^\top = e_0.$$
A quick-and-dirty way then is to invert this system.
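A minimal sketch of that quick-and-dirty approach (my illustration; the sign of $e_0$ depends on the chosen Cauchy transform convention): truncate the infinite system at size $N$, which implicitly sets $q_N(z) \approx 0$, and solve the shifted tridiagonal system.

```julia
using LinearAlgebra, ClassicalOrthogonalPolynomials

P = ChebyshevU()
N = 200                          # truncation size; fine away from the support
z = 1.3 + 0.5im
X = Matrix(jacobimatrix(P)[1:N, 1:N])
e0 = zeros(ComplexF64, N); e0[1] = 1
q = (X - z * I) \ e0             # q[k] ≈ q_{k-1}(z), using q_N(z) ≈ 0
```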
More generally, you just break it into its components, as the measures we consider should be expressible in terms of known things... note that all the Jacobi weights should be available here.
The thing is, this still depends upon knowing what $z$ is
In normal eigenvalue problems, $z$ can only take finitely many values, since the candidates are just eigenvalues; but in this inhomogeneous case, $z$ can take on a continuum of values.
Can we just find the roots of $Cf(z)/q_0(z)$?
Oh wait, that would mess up the RHS of the recurrence....
I think there's a super simple solution: $q_n(z) \to 0$ exponentially fast, so for all practical purposes we can replace the functional constraint with $e_n^\top v = 0$. Can you try that?
Well, it seems that works (at least for the one square root measure I have right now)
With the current test case I have, the inverse is accurate to roughly $10^{-6}$ relative error.
I'll try it with some other cases now
Cool! The accuracy isn't great, but maybe we need to impose the condition at a higher $n$ (this would mean the rank-1 perturbation is not in the bottom row)…
I tried the algorithm with just a single Newton iteration afterwards (evaluating the Cauchy transform and its derivatives using quadrature), and it seems to perform much better: it is accurate to machine precision, at least in the cases I tried.
Edit: more iterations are required when $G^{-1}(w)$ is close to the support. Additionally, if $G^{-1}(w)$ is extremely close to the support, I think $q_n$ doesn't go to 0 fast enough to make the imposed condition valid, and in these cases I cannot get an inverse out.
Yes, that's a good approach too. The routine for computing Cauchy transforms should work with dual numbers (ForwardDiff.derivative), so no need for quadrature…
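For concreteness, a minimal sketch of that Newton refinement (the helper name is mine; it assumes, as suggested, that the Cauchy transform routine G accepts dual-number perturbations). Since $G$ is holomorphic off the support, $G'(z)$ equals the derivative along the real direction, which ForwardDiff.derivative can compute:

```julia
using ForwardDiff

# Newton iteration for G(z) = w starting from z0; G'(z) is obtained by
# differentiating t ↦ G(z + t) at the real point t = 0 (valid by holomorphy).
function newton_refine(G, w, z0; tol = 1e-14, maxits = 20)
    z = z0
    for _ in 1:maxits
        r = G(z) - w
        abs(r) < tol && break
        dG = ForwardDiff.derivative(t -> G(z + t), 0.0)
        z -= r / dG
    end
    z
end
```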
There's a minor concern about coalescing roots. This would only happen near branch points, so it might not come up in our application. The other issue is roots near the support of the measure.
Sorry for writing this at an ungodly hour, but I have successfully inverted these measures using the combination of the comrade/colleague matrix + Newton iterations at random points in the complex plane:
$K\sqrt{1-x^2}\cos(x)\,dx$, $K\sqrt{1-x^2}\left(1 + \frac{x}{2} + \frac{x^3}{6}\right)dx$, $K(1-x)^2(x+1)^2\exp(x)\,dx$
These were all done to machine precision, except when $z = G^{-1}(w)$ lies very close to the support, in which case the algorithm fails because $q_n(z)$ is not close to 0.
Currently, I have these things to think about:
If our expansion of the density function in terms of OPs has only one non-zero coefficient (such as for the regular semicircle law), then the comrade matrix is 0×0, which is a problem. Additionally, if there aren't many non-zero coefficients in the expansion, then the matrix isn't very big; this is a problem because then $q_n(z)$ isn't very small, so setting it to 0 isn't good enough.
Is there a way to get a better approximation of $q_n(z)$, particularly when $z$ is close to the support so that $q_n(z)$ is not near 0 (some asymptotic formulas, maybe)? A better approximation of $q_n(z)$ (or maybe even of $q_0(z)$) would hopefully extend the domain where the algorithm remains valid.
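For what it's worth, in the square-root (ChebyshevU-weight) case the second-kind functions have a closed form (a standard identity, stated here for reference):
$$q_n(z) = 2\left(z - \sqrt{z-1}\,\sqrt{z+1}\right)^{n+1} = q_0(z)\,\rho(z)^n, \qquad \rho(z) = z - \sqrt{z-1}\,\sqrt{z+1},$$
so in this case one could impose $e_n^\top v = q_0(z)\rho(z)^n$ instead of $0$; note that $|\rho(z)| \to 1$ as $z$ approaches the support, which is exactly where the decay fails.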
I reckon a way to work around the problem of finding $G^{-1}(w)$ when it lies close to the support is: consider a finite sequence $w_k$ converging to $w$ whose terms generally lie away from the support, then iteratively solve for $q_n(G^{-1}(w_k))$ to get better and better approximations of $q_n(z)$ (i.e. just some stepping to help the algorithm).
Under this scheme, I can calculate the inverses at pretty much any point, including close to the support (as long as the expansion of the density in OPs has a few non-zero terms). Choosing such a sequence is kind of arbitrary at the moment, but it seems you generally need more points in the sequence when the inverse is close to the support or when the expansion of the density in OPs has few terms.
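A minimal sketch of that stepping scheme (names are mine; invert_far stands for the comrade-matrix solve with $q_n = 0$ imposed, which is reliable away from the support, and newton_refine is the Newton step sketched earlier):

```julia
# Walk a straight-line path from a point w_far away from the support down to
# the target w, warm-starting each Newton solve from the previous step.
function continuation_invert(G, w, w_far, invert_far; nsteps = 8)
    z = invert_far(w_far)                 # solve where q_n(z) ≈ 0 is valid
    for t in range(1/nsteps, 1; length = nsteps)
        wt = (1 - t) * w_far + t * w
        z = newton_refine(G, wt, z)
    end
    z
end
```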
Calculating the inverse when there's only one term (or too few terms) in the OP expansion is not really doable with the comrade matrix as far as I can see; in this case it's better to just Joukowski-transform and use FFT + polynomial inversion.
We might be able to think of the original system as a “nonlinear eigenvalues problem” for which there are a number of algorithms. Here’s a recent paper that might give some ideas:
It is possible to reformulate the problem as a NEP of the form $M(z)Q = 0$ whose solutions are exactly the solutions to $zQ = AQ + b$ with $\Sigma Q = q_0(z)$; the construction of $M$ from $A$, $b$, $\Sigma$ and $q_0$ is spelled out in the code below.
It turns out you can use Beyn's contour integral method (mentioned in the paper you shared above) for solving this NEP. I am trying other methods out, but some algorithms require evaluating $q_0(z)$ for a matrix-valued input, which seems problematic.
An implementation of Beyn's algorithm in Julia is found in NonlinearEigenproblems.jl. It requires the input of a contour and can locate inverses that happen to lie in its interior; the implemented contours there are just ellipses. It seems to be pretty computationally expensive when the contour is expanded.
> evaluating $q_0(z)$ for a matrix-valued input which seems problematic
For square-root measures or Legendre (including on multiple intervals) this is no problem, as it just involves square roots or logarithms of matrices (in Julia, sqrt(::Matrix) and log(::Matrix) compute the matrix square root and logarithm).
In general if a matrix is diagonalisable one can just diagonalise anyways.
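A minimal sketch of that (helper name is mine): evaluate a scalar holomorphic function at a diagonalisable matrix argument by diagonalising, assuming the spectrum stays off the branch cuts.

```julia
using LinearAlgebra

function matfun(f, A)
    λ, V = eigen(A)
    V * Diagonal(f.(λ)) / V
end

# e.g. q_0 at a matrix argument, for the square-root measure:
q_0_mat(A) = matfun(z -> 2 * (z - sqrt(z - 1) * sqrt(z + 1)), A)
```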
I had another idea now. Take the NEP from before, which has an eigenvalue $z$ iff $z$ is an inverse of the Cauchy transform.
We consider a transformed NEP. Previously we were looking, WLOG, for roots in the upper half plane (conjugate if in the lower half plane). Conformally map the upper half plane to the unit disk, so that we are now looking for transformed roots in the unit disk. Here we can use Beyn's contour method quite well: just integrate over a circular contour of radius $r < 1$ centred at 0. The larger $r$ is, the greater the area where roots can be successfully found, but more quadrature points are needed; in particular, we are theoretically guaranteed to find ALL inverses as $r \to 1$.
In practice, it is still very difficult to invert points whose inverse happens to lie near the support. However I can get closer to the support by taking additional quadrature points.
For example, using the measure $K\sqrt{1-x^2}(1 + \frac{x}{2} + \frac{x^3}{6})dx$, I can invert points such as $w = G(0.1+0.01im)$ with no problem.
EDIT: I forgot, the Joukowski map probably works better. EDIT 2: the Joukowski map is weird because it maps infinity to the interior of the unit disk, and there seem to be some problems with this: the matrix $M(\lambda)$ is almost rank-deficient when $\lambda$ is very large, yet there is no eigenvalue at infinity. An alternative approach might be to just map the right half plane $\mathrm{Re}(z) > 1$ to the unit disk to cover the slit plane $\mathbb{C} \setminus [-1,1]$.
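For reference, minimal sketches of the conformal maps in play (the names are mine; joukowski_inv is the same map as Jinv_p in the code below):

```julia
cayley(z)        = (z - im) / (z + im)            # upper half plane → unit disk
cayley_inv(w)    = im * (1 + w) / (1 - w)         # unit disk → upper half plane
joukowski_inv(z) = z - sqrt(z - 1) * sqrt(z + 1)  # C \ [-1,1] → unit disk
```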
I was thinking that a rectangular contour might help if we know exactly the form of the singularity but perhaps I’m missing something…
When transforming the domain using Joukowski, the new NEP, given in the form $M(J(z))$, has a singularity at 0, where some entries of $M(J(z))$ go to infinity. However, it appears that the determinant of $M(J(z))$ has a limiting value, so maybe this singularity could be removed? It seems to be the only thing stopping the use of the Joukowski map, which would guarantee recovery of all inverses.
(logarithmic absolute value of the determinant of $M(J(z))$, where the black spot corresponds to the inverse, and the white spot is where the singularity happens)
using LinearAlgebra, ClassicalOrthogonalPolynomials, SingularIntegrals
using Random
P = ChebyshevU()
# density p(x) = f(x) · (2/π)√(1-x²) on [-1,1]
p = x -> (x^3/6 + x/2 + 1) * 2/pi * √(1-x^2)
f = x -> x^3/6 + x/2 + 1
W = Weighted(P); p_expanded = expand(W, p);
x = axes(W, 1)
J = jacobimatrix(P);
f_k = expand(P, f)
# Cauchy transform G(z) = ∫ p(x)/(z-x) dx, evaluated via SingularIntegrals
function G(z)
    inv.(z .- x') * p_expanded
end
# Second-kind functions: q_n(z) = ∫ U_n(x) (2/π)√(1-x²)/(z-x) dx
function q_n(z, n)
    P = ChebyshevU()
    f = x -> P[x,n+1] * sqrt(1-x^2) * 2/pi
    W = Weighted(P); p_expanded = expand(W, f);
    inv.(z .- x') * p_expanded
end
n=2
z=1.3+0.5im
w = G(z)
# Closed form for the ChebyshevU weight: q_0(z) = 2(z - √(z-1)√(z+1))
function q_0(z)
    2 * (z - sqrt(z - 1) * sqrt(z + 1))
end
# Joukowski map J(ζ) = (ζ + 1/ζ)/2, scalar and matrix versions
function joukowski(x::ComplexF64)::ComplexF64
    (x + inv(x)) / 2
end
function joukowski(x::Matrix)::Matrix
    (x + inv(x)) / 2
end
# Build the truncated comrade matrix A (tridiagonal + rank-1 bottom-row
# perturbation), the inhomogeneous term b and the functional Σ from the
# first n+1 OP coefficients of f.
function inversecauchytransformmatrices(w, J, f, n; q = 0, tol=10^-9, maxits=30)
    f_k = f.args[2][1:n+1]      # leading OP coefficients of f
    last = f.args[2][n+2]       # first neglected coefficient
    A = Matrix(J[1:n+1,1:n+1])
    A[end,:] -= f_k ./ last .* J[n+1, n+2]   # rank-1 perturbation of the bottom row
    b = zeros(ComplexF64, n+1)
    b[end] = w/last * J[n+1, n+2]
    b[1] = 1
    Σ = zeros(n+1)
    Σ[1] += 1                   # Σᵀ Q picks out q_0(z)
    A, b, Σ
end
A, b, Σ = inversecauchytransformmatrices(w, J', f_k, n)
Q = [q_n(z, i) for i=0:n]
# Assemble the NEP: M(J(λ)) = A1 + J(λ)·A2 + q_0(J(λ))·A3 + J(λ)·q_0(J(λ))·A4
A1 = [
0 Σ'
b A
]
A2 = Diagonal([-(i != 0) for i=0:n+1])
A3 = [
0 zeros(n+1)'
A*Σ zeros(n+1, n+1)
]
A4 = [
0 zeros(n+1)'
-Σ zeros(n+1, n+1)
]
Jinv_p = z -> z - √(z - 1) * √(z + 1)   # inverse Joukowski map into the unit disk
f1 = z -> one(z)
f2 = z -> joukowski(z)
f3 = z -> q_0(joukowski(z))
f4 = z -> joukowski(z) * q_0(joukowski(z))
AA = [A1, A2, A3, A4]
fii = [f1, f2, f3, f4]
T = λ -> (A1 * f1(λ) + A2 * f2(λ) + A3 * f3(λ) + A4 * f4(λ))
display(det(T(Jinv_p(1.3+0.5im))))   # sanity check: ≈ 0 since z = 1.3+0.5im is an inverse
# Actual Beyn's contour method: approximate the moments
#   A_p = (1/2πi) ∮ λ^p T(λ)⁻¹ V̂ dλ,   p = 0, 1,
# by the trapezoid rule on the circle ϕ(t) = μ + r e^{it}, then extract the
# eigenvalues inside the contour from the pair (A_0N, A_1N) via an SVD.
m = 4
Random.seed!(10)
l = 4
V̂ = [randn() + randn()*im for i=1:m, j=1:l]
r = 0.4
N = 1000
k = 3
μ = 0.5
# r = 0.9
# N = 1000
# k = 3
# μ = 0
ϕ = t -> μ + r * exp(im * t)
dϕ = t -> im * r * exp(im * t)   # contour derivative (folded into the sums below)
A_0N = r/N * sum(inv(T(ϕ(2π*j/N))) * V̂ * exp(2pi*im * j / N) for j=0:N-1)
A_1N = μ * A_0N + r^2/N * sum(inv(T(ϕ(2π*j/N))) * V̂ * exp(4pi*im * j / N) for j=0:N-1)
V, S, W = svd(A_0N)
V_0 = V[1:m, 1:k]                # keep the k dominant singular directions
W_0 = W[1:l, 1:k]
S_0 = Diagonal(S[1:k])
B = V_0' * A_1N * W_0 * inv(S_0) # linearised pencil; its eigenvalues are the roots
joukowski.(eigvals(B))           # map back from the unit disk to the slit plane
This is a sample of what's going on with the Beyn contour integral algorithm. I can recover the inverses when the contour does not enclose the centre of the unit disk (so for a circle with r = 0.4 and μ = 0.5), but in the case of r = 0.9, μ = 0 the algorithm fails.
Does treating it as an NEP actually help us versus just treating it as a root finding problem?
From what I've tested: treating it as an NEP theoretically allows all inverses to be recovered with contour integrals, but this is very computationally expensive, and since we'd need to take inverses over a whole point cloud it may not be useful except for inverses which happen to lie near the support. There's a branch that contains the code I used to do this.
@time inversecauchytransform(1.009im, P, w, f, n, radius = 0.99, N=4000)
1.601156 seconds (1.91 M allocations: 127.397 MiB, 2.36% gc time)
2-element Vector{Any}:
-9.483694602322844e-13 - 0.4807201206241631im
3.248226654864378e-13 - 0.009299480162999259im
Here, with enough points and a high enough $r < 1$, I can find points lying very close to the support, at the expense of taking far too long. Currently the implementation in the branch applies Beyn's method 4 times, which is definitely not optimal.
Solving the inhomogeneous eigenvalue problem as we had before (setting $q_n(z) = 0$) doesn't allow us to take the inverse when it lies close to the support (since then $q_n(z)$ is not close to 0), but it is far faster computationally. Accuracy doesn't seem to be an issue when coupled with Newton iterations.
z=0.1*im
w = G(z)
@time inversecauchytransform(w, J, f_k, G, dG, 1; q=0, tol=10^-15)
0.001707 seconds (556 allocations: 33.156 KiB)
1-element Vector{Any}:
0.0 + 0.32761352419641243im
Here I completely miss the inverse $z = 0.1i$, though $0.3276i$ is also an inverse. Tweaking the parameter q=0 to equal $q_n(0.1i)$ allows the recovery of this inverse, but again $q_n(z)$ isn't known...
Also, I went through the 2012 preprint again and I'm still not entirely convinced by Theorem 2.2.
I think I sent this paper before https://arxiv.org/pdf/1302.4532.pdf which deals with the convolution of a semicircle with a Jacobi measure. Specifically, Lemma 2.7 says that if you consider $\mu_{sc} \boxplus \lambda \mu$ where $\mu$ is a Jacobi measure $\frac{1}{Z}(1+x)^a(1-x)^b\,\mathrm{d}x$ on $[-1,1]$, then there exist critical parameters $\lambda_1$ and $\lambda_2$ such that for $|\lambda| > \lambda_{1,2}$ there is no square-root behaviour on the left/right side of the support of the convolution.
Additionally, their Lemma A.4 gives a formula for the critical parameter in terms of an integral.
Consider the convolution of a standard semicircle $\mu_{sc}$ on $[-2,2]$ with the Jacobi measure $\mu = \frac{15}{16}(1-x)^2(1+x)^2\,\mathrm{d}x$. Then the critical parameters are $\lambda_1 = \lambda_2 = \sqrt{5/2}$, so $\mu_{sc} \boxplus 3\mu$ does not have square-root singularities.
Also, $G_\mu$ is univalent on the whole slit complex plane, which comes from https://arxiv.org/pdf/math/0306130.pdf
This is what the images of the support of each measure look like ($G_{\mu_{sc}}$, $G_{3\mu}$, $G_{\mu_{sc} \boxplus 3\mu}$ respectively), where the part I'm not sure about is the real-valued curves touching the real axis tangentially, which doesn't happen when $\lambda$ is not above the critical value.
Same image as before, but with $\lambda = 1.5$, below the critical value.
Looks very convincing. Probably there’s an extra condition missing on the parameters allowed in the Jacobi measure?
In the arXiv 1302 paper they state that the Jacobi parameters $a$ and $b$ need to be $\leq 1$ in order for square-root behaviour to always hold (Lemma 2.7); similar conditions were enforced in https://arxiv.org/abs/1804.11199
9d406e38a75b2cb1e3e93f4b121336d329586887
After some headbanging against a wall (type inference...), I linked a commit on a different branch that has the NEP contour method of inverting a Cauchy transform. I wrote my own Beyn contour method just to suit my purposes more.
It is more accurate than the general FFT + companion-matrix-based method and takes about the same time to execute. In particular, the FFT-based method cannot achieve that level of accuracy (afaik).
julia> sqm = SquarerootMeasure(x -> x^3/6 + x/2 + 1); n=2; z=-0.5 + 0.1im; y = sqm.G(z);
julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
4.632 ms (22952 allocations: 8.01 MiB)
1-element Vector{Any}:
-0.4999999999999883 + 0.10000000000000556im
julia> jm = JacobiMeasure(x -> (7x^2 + 1)/2, 2, 2); n=1; z=-0.5 + 0.1im; y = jm.G(z);
julia> @btime inversecauchytransform(y, jm, n; radius= 0.9, N = 1000)
6.224 ms (42748 allocations: 6.65 MiB)
1-element Vector{Any}:
-0.5000000000000075 + 0.09999999999999251im
julia> sqm = SquarerootMeasure(x -> (4x^2+1)/2); n=1; z=0.1im; y = sqm.G(z);
julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
4.174 ms (22968 allocations: 5.78 MiB)
2-element Vector{Any}:
-6.145050687937699e-15 + 0.10000000000001238im
1.0810235311851757e-14 + 0.3276135241963962im
Comparison with the FFT-companion method using FFTW (for the 2nd example):
P = Jacobi(2,2)
p = x -> 15/16 * (x-1)^2 * (x+1)^2 * (7x^2 + 1)/2
f = x -> (7x^2 + 1)/2
W = Weighted(P); p_expanded = expand(W, p);
x = axes(W, 1)
function G(z)
    inv.(z .- x') * p_expanded
end
julia> y = G(-0.5 + 0.1im)
julia> @btime testinvcauchytransform(y; maxterms=100, tol=1.2, r=0.9, n=127)
4.086 ms (16717 allocations: 1.29 MiB)
1-element Vector{ComplexF64}:
-0.49999999997643735 + 0.09999999992963382im
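For comparison, here is a rough reconstruction of that FFT-companion approach (my sketch, not the actual testinvcauchytransform): $h(\zeta) = G(J(\zeta)) - y$ is analytic in the unit disk after the Joukowski substitution, so sample it on a circle of radius $r$, recover Taylor coefficients by FFT, and root-find via the companion matrix (through Polynomials.jl), filtering out spurious roots.

```julia
using FFTW, Polynomials

function fft_companion_invert(G, y; r = 0.9, n = 128, tol = 1e-8)
    Jmap = ζ -> (ζ + inv(ζ)) / 2                    # Joukowski map
    h = ζ -> G(Jmap(ζ)) - y
    θ = 2π .* (0:n-1) ./ n
    c = fft(h.(r .* cis.(θ))) ./ n ./ r .^ (0:n-1)  # Taylor coefficients of h
    ζs = roots(Polynomial(c))
    [Jmap(ζ) for ζ in ζs if abs(ζ) < r && abs(h(ζ)) < tol]
end
```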
The Vector{Any} results are a big red flag: it means it's nowhere near as fast as it should be. You can use @code_warntype to see where type inference goes wrong...
After fixing the type inference and using StaticArrays.jl (because we're inverting 1000 4×4 matrices)... I also checked that the vast majority of the time/allocations now comes from evaluating $q_0(z)$ 1000 times.
julia> sqm = SquarerootMeasure(x -> x^3/6 + x/2 + 1); n=2; z=-0.5 + 0.1im; y = sqm.G(z);
julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
2.516 ms (15052 allocations: 2.62 MiB)
1-element Vector{ComplexF64}:
-0.4999999999999878 + 0.10000000000000489im
julia> jm = JacobiMeasure(x -> (7x^2 + 1)/2, 2, 2); n=1; z=-0.5 + 0.1im; y = jm.G(z);
julia> @btime inversecauchytransform(y, jm, n; radius= 0.9, N = 1000)
4.292 ms (34841 allocations: 2.60 MiB)
1-element Vector{ComplexF64}:
-0.5000000000000027 + 0.09999999999999179im
julia> sqm = SquarerootMeasure(x -> (4x^2+1)/2); n=1; z=0.1im; y = sqm.G(z);
julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
2.280 ms (15052 allocations: 1.73 MiB)
2-element Vector{ComplexF64}:
-1.0658800231322508e-14 + 0.10000000000000689im
1.7609577994613308e-14 + 0.32761352419640066im
Oh damn. After using the SingularIntegrals functionality for evaluating on a vector of points:
julia> sqm = SquarerootMeasure(x -> x^3/6 + x/2 + 1); n=2; z=-0.5 + 0.1im; y = sqm.G(z);
julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
866.800 μs (423 allocations: 961.69 KiB)
1-element Vector{ComplexF64}:
-0.49999999999998257 + 0.09999999999999465im
julia> jm = JacobiMeasure(x -> (7x^2 + 1)/2, 2, 2); n=1; z=-0.5 + 0.1im; y = jm.G(z);
julia> @btime inversecauchytransform(y, jm, n; radius= 0.9, N = 1000)
2.258 ms (4351 allocations: 711.64 KiB)
1-element Vector{ComplexF64}:
-0.5000000000000047 + 0.0999999999999914im
julia> sqm = SquarerootMeasure(x -> (4x^2+1)/2); n=1; z=0.1im; y = sqm.G(z);
julia> @btime inversecauchytransform(y, sqm, n; radius= 0.9, N = 1000)
744.300 μs (423 allocations: 590.25 KiB)
2-element Vector{ComplexF64}:
3.8426583692352975e-15 + 0.10000000000001455im
-4.4751317772311766e-15 + 0.32761352419638456im
One thing I'm thinking about: whenever we expand a measure's density in terms of orthogonal polynomials and truncate the sequence, can we prove that, as we take more and more terms in the expansion, the solutions to the system $zQ = AQ + b$ (where $A$ is the comrade matrix) converge to the actual inverses? I'm wondering if there's some standard technique to deal with this sort of thing.
This is an interesting question. For standard companion matrices we have results: if a sequence of analytic functions converges uniformly on the boundary of a set, then it asymptotically has the same number of zeros, and they converge to the true zeros (I think this follows from https://en.wikipedia.org/wiki/Rouché's_theorem). Is that enough?
Yes, it does work (at least if the support of the measure is compact); it also goes by the name of Hurwitz's theorem.
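For reference, the statement being invoked: if $f_k \to f$ locally uniformly on a domain $\Omega$, $f \not\equiv 0$, and $f$ has a zero of multiplicity $m$ at $z_0$, then for every sufficiently small $\varepsilon > 0$ all but finitely many $f_k$ have exactly $m$ zeros (counted with multiplicity) in $\{|z - z_0| < \varepsilon\}$, and these zeros converge to $z_0$.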
Additionally, I thought that it might be possible to speed up inverting point clouds by using rank-1 updates (Sherman–Morrison formula) instead of re-inverting the matrix in the contour integral for each additional point.
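A minimal sketch of that update (helper name is mine). In the NEP above, changing the target value $w$ changes a single entry of $M(\lambda)$ (through $b$), i.e. $M_{\text{new}} = M + u v^\top$ with $u \propto \Delta w\, e_{\text{end}}$ and $v = e_1$, so the inverse at each contour node can be updated instead of recomputed:

```julia
using LinearAlgebra

# (M + u vᵀ)⁻¹ = M⁻¹ - (M⁻¹ u)(vᵀ M⁻¹) / (1 + vᵀ M⁻¹ u)
function sherman_morrison(Minv, u, v)
    Mu = Minv * u
    vM = transpose(v) * Minv
    Minv - (Mu * vM) / (1 + vM * u)
end
```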
I really like this idea!
All the matrices that need to be inverted are of "comrade" form: tridiagonal with a rank-1 perturbation in the bottom row. There has to be some optimised routine for calculating inverses like these? I didn't find another package that had a special type for them.
Probably using sparse is good enough, or you can use AlmostBandedMatrix in SemiseparableMatrices.jl.
I tried using AlmostBandedMatrix, but most of the matrices are small, at most 20×20. I find that solving $Ax=b$ is faster using AlmostBandedMatrix only when $n = 100$ or more.
Oh yeah, for matrices that small it'll be hard to beat dense linear algebra. Though I guess banded and tridiagonal do:
julia> n = 20; A = randn(n,n); x = randn(n); @btime A\x;
3.774 μs (3 allocations: 3.69 KiB)
julia> n = 20; A = Tridiagonal(randn(n-1),randn(n),randn(n-1)); x = randn(n); @btime A\x;
584.028 ns (8 allocations: 1.36 KiB)
julia> n = 20; A = BandedMatrix(-1 => randn(n-1), 0 => randn(n),1 => randn(n-1)); x = randn(n); @btime A\x;
1.926 μs (8 allocations: 1.70 KiB)
Using the Woodbury formula might be faster than AlmostBandedMatrix (but the latter uses QR so is more stable).
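A minimal sketch of that Woodbury route (helper name is mine): treat the comrade matrix as a Tridiagonal plus a rank-1 bottom-row perturbation $u v^\top$ and solve with two $O(n)$ tridiagonal solves. As noted, the QR-based AlmostBandedMatrix route is more stable if the tridiagonal part is near-singular.

```julia
using LinearAlgebra

# Solve (T + u vᵀ) x = b via the Sherman–Morrison/Woodbury identity.
function comrade_solve(T::Tridiagonal, u, v, b)
    y = T \ b                 # fast tridiagonal solve
    s = T \ u                 # correction direction
    y - s * ((transpose(v) * y) / (1 + transpose(v) * s))
end
```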
I have tried the inversion algorithm we discussed yesterday, and it seems the idea does work overall (bar one problem; I can invert a square-root measure). There's one main problem I'm having at the moment.
When we impose the extra condition $\begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} v = q_0(z)$, it depends on $z$. Once we know what this value is supposed to be, it is easy to find the inverses. Without knowing $q_0(z)$, we can only constrain the set of "eigenvectors", much like how in a regular eigenvalue problem you can specify the eigenvectors to lie in subspaces. In the inhomogeneous case, they lie on hyperbolic curves, and I'm not sure how to deal with this yet.
Do we have a meeting tomorrow 10:30am to discuss this?