I am of the opinion that we should probably rework the SemiclassicalJacobi package to use the Cholesky / QR stuff (probably just via the implementation in FastTransforms) or at least have that as an option. That was sort of the reason I started looking at alternative basis change approaches in the first place.
It's not obvious to me that it would be better to instead use Cholesky on the annulus directly - we only have preliminary results on how that method works for multivariate polynomials and expect the complexity to be a concern. If you can reduce it to a 1D basis change that's probably better.
Agreed. So how's https://github.com/JuliaApproximation/ClassicalOrthogonalPolynomials.jl/pull/63 coming along?
ok, makes sense. Should I look through https://github.com/JuliaApproximation/ClassicalOrthogonalPolynomials.jl/pull/63 to get an idea of what needs to be done for Semiclassical Jacobi?
That PR pretty much worked as it was until we got more ambitious with @MikaelSlevinsky and started wanting to do rational stuff too. It may be a bit stale at this point. That said, I think it's a better idea to archive that PR and just use the implementation in FastTransforms - we are already loading that package anyway so recreating it seems pointless, especially since we get the rational stuff as well then.
I can do that if we want it. Does the previous plan still make sense, i.e. having something like DecompPolynomial exist in ClassicalOPs.jl which functions analogously to LanczosPolynomial from a user perspective? Or should we just make NonClassicalOrthogonalPolynomials.jl?
Note FastTransforms is C code so will only work with FloatXX
and possibly BigFloat
… it’s inevitable we’ll eventually want to autodiff
also I don’t know if his code caches in the way we need
I don’t know what you had in mind for DecompPolynomial
Ok if you want autodiff I'll see if I can dust off that PR and aim for a native Julia version instead. If I remember correctly it should pretty much work already.
PS this is where the Givens rotations are computed for the transforms. It computes the 2-step raising operators and the Jacobi matrices as well, so may be useful to see
If it’s easier to use FastTransforms we can just burn the auto diff bridge when we get to it…
I do agree LanczosPolynomial, CholeskyPolynomial, etc. could naturally live in a GeneralOrthogonalPolynomials.jl package
(Arguably LanczosPolynomial is redundant but we can keep it… probably should add an orthogonalpolynomial(w::AbstractQuasiVector) routine alongside an orthogonalpolynomial(w::Function, domain). Or use Measures.jl???)
@TSGut what's the status of the PR?
It depends on what we want it to do. We have to be a bit careful since Semiclassical Jacobi involves weights that vanish on the boundary. Are you relying on lowering/raising c specifically, or a and b as well? (where a and b are the Jacobi parameters and c is the semiclassical part)
a and b as well...
Well, at least it should beat Lanczos (which I guess is a low bar since that crashes for you). I am currently testing it for SemiclassicalOPs, can you give me approximate values that you think you will need as a test case? c high (~100) and a,b low like in the issue post here?
Yes a and b will remain low, i.e. always <5. Most often a,b = 0 or 1. The cap of c is essentially the cap in the degree we can consider. If we can push c up to 300, that would be amazing. I'd say ~150 is probably good enough to get machine precision in most examples.
Okay. I can see why Lanczos struggles with high c; Cholesky also struggles, since the weight modification starts to look very close to non-positive-definite. But I am working on it, let's see if we can push c higher.
The positive definiteness only fails by a tiny bit, presumably due to float shenanigans, but this should not be an issue to get up to 150 and higher. A simple diagonal perturbation makes it succeed, maybe there is a more elegant way but for practical purposes I expect Cholesky will work fine for this purpose.
Here it is working up to c=300 with a diagonal 5*eps() perturbation:
julia> W = (P \ ((t.-x).^300 .* P))
ℵ₀×ℵ₀ Clenshaw{Float64} with 99 degree polynomial:
0.00374743 -0.00644773 0.00821405 -0.0095271 0.0105193 -0.0112493 0.011751 -0.0120487 …
-0.00644773 0.0110943 -0.0141348 0.0163964 -0.0181072 0.0193681 -0.0202373 0.0207563
0.00821405 -0.0141348 0.0180117 -0.0208992 0.0230879 -0.0247065 0.0258291 -0.0265078
-0.0095271 0.0163964 -0.0208992 0.0242593 -0.0268141 0.0287129 -0.0300413 0.0308593
0.0105193 -0.0181072 0.0230879 -0.0268141 0.0296588 -0.0317871 0.0332928 -0.0342412
-0.0112493 0.0193681 -0.0247065 0.0287129 -0.0317871 0.0341055 -0.0357679 0.0368431 …
0.011751 -0.0202373 0.0258291 -0.0300413 0.0332928 -0.0357679 0.0375703 -0.0387703
-0.0120487 0.0207563 -0.0265078 0.0308593 -0.0342412 0.0368431 -0.0387703 0.0400931
0.0121624 -0.0209596 0.0267863 -0.0312164 0.034686 -0.0373866 0.0394237 -0.0408665
⋮ ⋮ ⋱
julia> R = cholesky(Symmetric(W)+I*5eps()).U
ℵ₀×ℵ₀ UpperTriangular{Float64, InfiniteLinearAlgebra.AdaptiveCholeskyFactors{Float64, BandedMatrices.BandedMatrix{Float64, Matrix{Float64}, Base.OneTo{Int64}}, LazyArrays.BroadcastMatrix{Float64, typeof(+), Tuple{ClassicalOrthogonalPolynomials.Clenshaw{Float64, SubArray{Float64, 1, Vector{Float64}, Tuple{Base.OneTo{Int64}}, true}, LazyArrays.BroadcastVector{Float64, typeof(inv), Tuple{LazyArrays.BroadcastVector{Float64, typeof(sqrt), Tuple{LazyArrays.BroadcastVector{Float64, typeof(*), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Int64, Int64}}}, Float64}}, LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Int64, Int64}}}, Float64}}}}}}}}, LazyArrays.BroadcastVector{Float64, typeof(-), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{FillArrays.Fill{Float64, 1, Tuple{InfiniteArrays.OneToInf{Int64}}}, LazyArrays.BroadcastVector{Float64, typeof(sqrt), Tuple{LazyArrays.BroadcastVector{Float64, typeof(*), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Int64, Int64}}}, Float64}}, LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Int64, Int64}}}, Float64}}}}}}}}}}, LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{LazyArrays.ApplyArray{Float64, 1, typeof(vcat), Tuple{Float64, LazyArrays.BroadcastVector{Float64, typeof(sqrt), Tuple{LazyArrays.BroadcastVector{Float64, typeof(*), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Int64, Int64}}}, Float64}}, LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Int64, Int64}}}, Float64}}}}}}}}, LazyArrays.BroadcastVector{Float64, typeof(sqrt), Tuple{LazyArrays.BroadcastVector{Float64, typeof(*), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Int64, Int64}}}, Float64}}, LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Int64, Int64}}}, Float64}}}}}}}}, LazyBandedMatrices.SymTridiagonal{Float64, FillArrays.Fill{Float64, 1, Tuple{InfiniteArrays.OneToInf{Int64}}}, LazyArrays.BroadcastVector{Float64, typeof(sqrt), Tuple{LazyArrays.BroadcastVector{Float64, typeof(*), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Int64, Int64}}}, Float64}}, LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{LazyArrays.BroadcastVector{Float64, typeof(/), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Int64, Int64}}}, Float64}}}}}}}}, Diagonal{Float64, FillArrays.Fill{Float64, 1, Tuple{InfiniteArrays.OneToInf{Int64}}}}}}}} with 
indices OneToInf()×OneToInf():
0.0612162 -0.105327 0.134181 -0.15563 0.171838 -0.183763 0.19196 -0.196821 …
⋅ 0.000700142 -0.00267594 0.00620781 -0.0114248 0.0183286 -0.0268081 0.0366551
⋅ ⋅ 1.77306e-5 -0.000102831 0.000340672 -0.000850244 0.00177679 -0.00328023
⋅ ⋅ ⋅ 9.35676e-7 -5.72943e-6 2.16419e-5 -6.29065e-5 0.000153577
⋅ ⋅ ⋅ ⋅ 1.61278e-6 -9.847e-6 3.58956e-5 -9.99085e-5
⋅ ⋅ ⋅ ⋅ ⋅ 8.00777e-7 -4.12444e-6 1.34714e-5 …
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 5.53858e-7 -2.94058e-6
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ 7.49623e-7
⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅
⋮ ⋮ ⋱
julia> maximum(abs.(R[1:100,1:100]'R[1:100,1:100]-W[1:100,1:100]))
1.1726730697603216e-15
Okay, I have it working for SemiclassicalJacobi now, but for an implementation that isn't full of weird workarounds we first need to address the two bugs from which I cross-referenced this issue above.
Nice, thank you! Are you going to be making a PR?
Yes, working on tidying it up a bit. Though before merging we should fix the issues above so it's less clunky.
We should be using @MikaelSlevinsky's orthogonal versions:
That is, don't construct P^(0,0,100)(r^2); instead construct r^100 P^(0,0,100)(r^2).
Yes, I have been toying with using the QR version to get around the positive-definiteness breakdown -- it works as long as the square root of the weight modification is a polynomial. So for (a,b,c) = (0,0,100) it is fine. To reach e.g. (a,b,c) = (0,0,101) we would have to start from e.g. (a,b,c) = (0,0,1) to make this work, which is okay but involves two basis changes, which might be bad for error accumulation (though I suppose not as bad as an ad hoc perturbation to get Cholesky working). Is that how you meant it?
Or did you mean just computing the weighted polynomials via the second line - in that case I don't see how that helps us, since I thought we were looking for access to P^(0,0,100)? Q is also not as nice / fast to apply as R.
That's exactly what Mikael does: treats odd and even separately
We never need P^(0,0,100) and it should never be used since it will be ill-conditioned. What we want for annuli are the weighted ones
Q is also not as nice / fast to apply as R.
Not true. It's basically as fast (O(n)) since it's a QR of a tridiagonal matrix
We never need P^(0,0,100) and it should never be used since it will be ill-conditioned. What we want for annuli are the weighted ones
Ok, that is a use-case thing, I was just going off of the issue description here. I don't know that I have access to the annulus offshoot of the Semiclassical paper if it exists already? Found it, will read through to refresh on use case.
Not true. It's basically as fast (O(n)) since its a QR of a tridiagonal matrix
It's the QR of a general banded matrix not tridiagonal -- the thing we apply QR to is not tridiagonal unless we are only a degree 2 polynomial away from the target basis (in which case the square root of the modification will be tridiagonal). Q is definitely fine in practice but I don't think it's as nice as banded-upper-triangular.
In any case, since you say we want the weighted bases, I will focus on that functionality.
QR of a banded matrix is fine:
julia> using BandedMatrices, LinearAlgebra

julia> A = brand(10_000_000, 10_000_000, 2, 1);
julia> Q,R = qr(A);
julia> x = randn(size(A,1));
julia> @time Q*x;
0.218650 seconds (2 allocations: 76.294 MiB, 38.30% gc time)
Also IT IS a SymTridiagonal, since sqrt(U) = t*I - X !!
I think we're talking about different things, I was talking about the general procedure of using QR for polynomial measure modification but you're clearly talking about a concrete example.
I'm implementing the whole general case at the moment so that's what's on my mind. The example you're talking about is tridiagonal, yes.
@TSGut Ok, I can be more specific about what's failing. There are a couple of things, but the main one is instability in the lowering between weighted and unweighted annuli OPs. In particular, both of these fail fairly quickly for moderate c (e.g. c=25):
c=25; SemiclassicalJacobi(t,1,1,c) \ SemiclassicalJacobi(t,0,0,c)
c=25; Weighted(SemiclassicalJacobi(t,0,0,c)) \ Weighted(SemiclassicalJacobi(t,1,1,c))
Note that actually the c parameter remains unchanged. So essentially we need to do a Cholesky factorisation for each c:
c = 25; X = jacobimatrix(SemiclassicalJacobi(t,0,0,c));
cholesky(X*(I-X)).
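For reference, a minimal dense sketch of how that Cholesky factorisation produces the raised Jacobi matrix (finite section with a stand-in X; the packages do this adaptively on infinite banded matrices, so this is only illustrative):

using LinearAlgebra

n = 30
# stand-in for jacobimatrix(SemiclassicalJacobi(t,0,0,c)): orthonormal Legendre
# shifted to [0,1], so the spectrum of X lies in (0,1) and X*(I - X) is positive definite
X = Matrix(SymTridiagonal(fill(0.5, n), [k / (2sqrt(4k^2 - 1)) for k in 1:n-1]))

# raising (0,0,c) -> (1,1,c) multiplies the weight by x*(1-x), and the Gram matrix
# of the old OPs under the new weight is X*(I - X) = R'R
R = cholesky(Symmetric(X * (I - X))).U
J = R * X / R   # Jacobi matrix of the raised family, J = R*X*inv(R)

# J is symmetric tridiagonal, up to truncation error in the trailing rows/columns
norm(J[1:n-5, 1:n-5] - J[1:n-5, 1:n-5]')   # ~ machine epsilon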
At the moment, the Jacobi matrix for SemiclassicalJacobi(t,0,0,c) cannot be assembled even for c=25 via the Lanczos route. Is it any better via Cholesky? Most likely that is also unstable. What I think @dlfivefifty is referring to is that, as an alternative to the Jacobi matrix of SemiclassicalJacobi(t,0,0,c), we instead compute the Jacobi matrix of (t-x)^(c/2)*SemiclassicalJacobi(t,0,0,c). This would be computed via the QR factorisation and probably has better stability?
Yes, both of those finish computing for me with my WIP Cholesky/QR implementation. Have you tried running it on my PR branch? Error reports would be appreciated.
I think I am getting reasonably close to everything working now.
What branches do I need? Just your branch in SemiclassicalOrthogonalPolynomials.jl gives me the error
julia> Ls[25]
ERROR: UndefVarError: cholesky_jacobimatrix not defined
You also need my PR branch of ClassicalOrthogonalPolynomials.jl. I believe that is it at the moment.
Seems to be working ok, though it initially still errors as you need to import cholesky_jacobimatrix from ClassicalOrthogonalPolynomials.jl in SemiclassicalOrthogonalPolynomials.jl. I can push to higher c than I could with Lanczos, but still not high enough. Do you have an example of the sqrt-weighted semiclassical orthogonal polys?
I must be missing something about what is being asked of me here. :sweat_smile: I understand the bottom row of the table excerpt Sheehan posted, but not what we want to do with it in the implementation, i.e. what am I supposed to compute using that connection? I don't think we want the Q (though if so that would be easy), and as for
we instead compute the Jacobi matrix of
(t-x)^(c/2)*SemiclassicalJacobi(t,0,0,c)
. This would be computed via the QR factorisation and probably has better stability?
isn't S(x) = (t-x)^(c/2)*SemiclassicalJacobi(t,0,0,c) just scaled Legendre for all even c? We have
$$\int S_n(x) S_m(x) dx = \int (t-x)^{c} Q^{(0,0,c)}_n(x) Q^{(0,0,c)}_m(x) dx = \delta_n^m$$
so those things are orthogonal with respect to the uniform weight?
Oh and I think we will be able to push higher once everything is sorted. Just to double check: it really has to be lowering by 1, and we cannot use a trick so that we instead only need lowering by 2 (or any even number)? Lowering and raising by even numbers is going to be far better thanks to QR. Not the end of the world if it can't be done, but I thought it would be worth asking.
I don't exactly like it but if we cannot think of a better way, we can always first raise by (1,1,0) then lower by (2,2,0) and combine the raising and lowering operators. Since it would all be banded and we have access to both sides of the conversion maybe we can get away with it. I will run some tests on this.
Lastly, the issue with Cholesky vs QR is not "raw" stability per se, since Cholesky is actually more stable than QR last time I checked. But QR is better for these high-parameter problems for us because Cholesky fails catastrophically once positive definiteness is lost to approximation noise, as opposed to gradually becoming worse, and is thus more sensitive to noise.
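A toy illustration of that failure mode (not from the packages): squaring up the modification and then Cholesky-ing it can produce an exactly singular matrix purely from rounding, whereas QR of the unsquared factor is unaffected:

julia> using LinearAlgebra

julia> p = [1.0 1.0; 0.0 1e-9];   # stand-in for a square root of the modification

julia> p'p                        # the (2,2) entry 1 + 1e-18 rounds to 1.0
2×2 Matrix{Float64}:
 1.0  1.0
 1.0  1.0

julia> cholesky(Symmetric(p'p))   # exactly singular, Cholesky fails outright
ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.

julia> R = qr(p).R;               # fine, and R'R reproduces p'p up to rounding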
It should definitely be QR, just like the SH transform and Zernike transform. But it's not "lowering by 2" we need. As John said:
we instead compute the Jacobi matrix of (t-x)^(c/2)*SemiclassicalJacobi(t,0,0,c)
and we're sure there's no typo here and you actually want that weighted basis? Is it obvious then why this isn't just scaled Legendre?
Isn't this a general fact due to uniqueness? If $Q_n(x)$ are OPs with respect to a weight $u(x)$, and if furthermore $\sqrt{u(x)}$ is a polynomial in $x$, then the polynomials defined by $S_n(x) := \sqrt{u(x)}Q_n(x)$ are just scaled Legendre (with a few terms missing) since
$$\int_I S_n(x) S_m(x) dx = \int_I u(x) Q_n(x) Q_m(x) dx = c_{n,m} \delta_n^m$$
I assume there is a mistake in my reasoning here somewhere? In any case, computing the jacobimatrix for (t-x)^(c/2)*SemiclassicalJacobi(t,0,0,c) isn't a problem. Is there functionality in this package at the moment for representing semiclassical OPs weighted by something that is not their full weight? HalfWeighted doesn't appear to quite mean the same thing. If not, it's a bit of extra busy work, but functionally this is fine.
EDIT: Hm, is it the lack of completeness that allows it to escape from being Legendre?
Ok I played around with the equivalent Jacobi polynomial case and get it now. Should be straightforward to implement. :)
This is the whole point of Mikael's stuff: (1-x^2) * P^(2,2) are orthonormal wrt Lebesgue, so the conversion to Legendre must be orthogonal....
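To spell that out in the notation from before: writing the conversion in terms of orthonormal Legendre $\tilde P_k$ and orthonormal Jacobi $\tilde P_n^{(2,2)}$,

$$(1-x^2)\tilde P_n^{(2,2)}(x) = \sum_{k=0}^{n+2} \tilde P_k(x) Q_{kn}, \qquad (Q^\top Q)_{nm} = \int_{-1}^1 (1-x^2)^2 \tilde P_n^{(2,2)}(x) \tilde P_m^{(2,2)}(x) dx = \delta_{nm},$$

so the connection matrix $Q$ has orthonormal columns, with column $n$ supported on rows $k \le n+2$: it is exactly the orthogonal factor of a QR factorisation, not an invertible triangular conversion. And $QQ^\top \neq I$, since the functions $(1-x^2)\tilde P_n^{(2,2)}$ only span a proper subspace of $L^2$; that is the completeness loophole noted in the EDIT above, and why the $S_n$ are not simply scaled Legendre.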
yep, I got it now, easy to implement.
Question is about the details now - should this be something like SqrtWeightedPolynomials struct generically in ClassicalOPs.jl? Or do we specifically do the Semiclassical version in SemiclassicalOPs?
If I understand correctly, I would summarize (big picture) our approach as follows, keeping in mind that SemiclassicalJacobi is a means to bivariate annulus OPs:

1. Build the SemiclassicalJacobi hierarchy (this means a and b fixed and c = c_0:infty, where usually c_0 = 0,+/-1,... or at least c_0 is close to 0). This can be done adaptively as in the Julia implementation, or for a fixed degree (that decrements down the hierarchy) as in the C implementation for the transform. That said, the computation should be exactly the same: start with the c_0 case, which modifies a Jacobi Jacobi matrix, then use QR in steps of two to construct all the even increments. Modify c_0 to c_0+1 via Cholesky, then use QR in steps of two to do the odd increments. (A sketch of this incrementation pattern appears below.)

2. An interface between SemiclassicalJacobi and ZernikeAnnulus that translates the above 1D results to the annulus OPs. The above Jacobi matrices are really multiplication by r (r^2? Definitely some f(r)), but arguably we need x and y on the annulus. We also need the raising and lowering operators and partial derivatives. I don't know exactly how this part works yet, but it would seem logical that an interface serves us better here than reconstructing all these operators in 2D from scratch (since we know so much about the univariate radial OPs). I think this would be a pretty cool blueprint for other semiclassical bivariate OPs in the Koornwinder class.

That sounds more like what my original plan was, since (1) and (2) align exactly with what I have been working on in my PR branches so far. I will now add the stuff Sheehan and John mentioned too and work on a SqrtWeightedPolynomial thing via QR - at least conceptually it is just as straightforward.
A question for @MikaelSlevinsky: when constructing the hierarchy, do you believe it numerically best to move up step by step like you explain, rather than going the direct path? That is e.g. to get to (1,1,20), should we not go from (1,1,0) to (1,1,20) directly via QR rather than constructing all even substeps? It was my impression that the direct path is more stable, but that's just based on a small number of experiments (and this may reverse if c is high).
SqrtWeightedPolynomials struct generically in ClassicalOPs.jl
Yes. Though just follow the convention (i.e. we have Weighted so it should just be called SqrtWeighted).
arguably we need x and y on the annulus
We did these for Zernike but they can wait: we can focus on operators that commute with rotations for now. (General operators should probably be pre-conditioned by these in an iterative method to retain computational cost.)
We also need the raising and lowering operators and partial derivatives
We don't want partial derivatives, we want the gradient. One can construct vector-analogues of Fourier modes (we are working this out on the disk): since the gradient is equivariant we would still have decoupling between Fourier modes.
That is e.g. to get to (1,1,20), should we not go from (1,1,0) to (1,1,20) directly via QR rather than constructing all even substeps?
We need (1,1,c) for all c between 0 and 20 anyways.... so we should do it step-by-step.
Apart from the fact that we need all integer increments in c anyways, Kautsky and Golub have evidence that successive incrementation is more stable when the roots are near the interval of orthogonality (see the three tables in https://www.sciencedirect.com/science/article/pii/0024379583800287). In this sense, the worst case is when $\rho = 0$, and I test for that here against the Zernike Givens rotations, which we have analytically (test result log).
When the roots are far, it's okay (and more efficient) to not factorize the $\sqrt{u(x)}$ polynomial modification.
In this sense, we're lucky to know that $u(x)$ has a single root with high algebraic multiplicity, allowing us to do the increments stably. Otherwise, we'd have to factor it and the numerically computed roots tend to have error $O(\epsilon^{1/c})$ for a Chebyshev series of $(t-x)^c$.
I have a question about computing the Jacobi matrices of the semiclassical Jacobis via QR. There seem to be two ways to me: either via the Q or via the R. I've attached a small note. I believe @TSGut has implemented it via the R route. Since it's triangular that makes sense, but you do need to compute an inverse. @dlfivefifty @MikaelSlevinsky what are your thoughts?
You don't actually compute the inverse, you compute O(1) entries of the inverse applied to a vector. This is needed to get O(n) complexity.
Same is true for the Q route. I suspect the Q is more stable.
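For what it's worth, in exact arithmetic the two routes coincide: since $p(X)$ commutes with $X$, $R X R^{-1} = Q^\top p(X) X p(X)^{-1} Q = Q^\top X Q$, so the difference is purely in how rounding and truncation propagate in the adaptive computation. A quick dense sanity check (same stand-in shifted-Legendre X as in the sketch further up; illustrative only):

using LinearAlgebra

n = 30
X = Matrix(SymTridiagonal(fill(0.5, n), [k / (2sqrt(4k^2 - 1)) for k in 1:n-1]))
t = 1.2
F = qr(t * I - X)        # p(X) for p(x) = t - x, i.e. raising c by 2

J_R = F.R * X / F.R      # R route: triangular similarity transform
Q = Matrix(F.Q)
J_Q = Q' * X * Q         # Q route: orthogonal conjugation, symmetric by construction

norm(J_R - J_Q)          # agreement to rounding; truncation error sits in the
                         # trailing rows/columns of both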
Yeah that's probably better 🍻👍. That the result is symmetric tridiagonal implies the existence of an O(n) complexity algorithm, with an error in the last row and column that are dropped; we need smaller views as we increment up the hierarchy (same as with the R similarity).
Swapping it to Q should be easy enough, I can just try it and see how it fares stability-wise.
I think this is fixed with the latest merge, please let me know if you find any bugs, instability or slow behavior in general.
@TSGut @dlfivefifty @MikaelSlevinsky I am witnessing instability in the lowering/raising when considering Semiclassical Jacobi with a large c parameter.
E.g. the following works:
ρ=0.02; t=inv(1-ρ^2);c = 5; SemiclassicalJacobi(t, 1, 1, c) \ SemiclassicalJacobi(t, 0, 1, c)
But the following fails (error given at the end):
ρ=0.02; t=inv(1-ρ^2);c = 100; SemiclassicalJacobi(t, 1, 1, c) \ SemiclassicalJacobi(t, 0, 1, c)
The reason I care is because I use these raising/lowering operators for the raising/lowering of the annuli OPs. I have 3 questions:
1) Should we fix this? Or..
2) Are we planning on changing the whole interface for the raising/lowering and I should just wait? Or..
3) Should I be implementing the raising/lowering of the annuli OPs directly (via QR/Cholesky) and not be going via the semiclassical Jacobi?
Simply switching to BigFloat currently has its own issues. But maybe 4) Get the BigFloat implementation working.