Open vneiger opened 7 months ago
PS: I "assign" myself to this, but anyone should feel free to handle this! I won't be able to work on it before mid-May at the earliest. What I might be able to do sooner than this is to write a prototype in SageMath; let me know if that can help.
This algorithm looks very promising! It'll also be interesting to see how the numerical stability compares for numerical/ball coefficients.
What I might be able to do sooner than this is to write a prototype in SageMath; let me know if that can help.
Yes, that would be useful.
A public domain C++ implementation: https://github.com/hly1204/library/blob/master/math/fps_composition.hpp
Here is a SageMath implementation. The core algorithm is at lines 102-133. There are some easy utility things at the beginning of the file, and all other things are either testing/timing functions, or verbose variants. Any comments/modifications welcome.
If you launch the file as such, it should launch a verbose variant in precision 10000 (allowing to see the degrees manipulated along the recursion, showing that this should be quasi-linear). As such the algorithm is not quasi-linear in Sage, I suppose due to the bivariate multiplications (I did not focus much on this since it would make more sense to optimize it in Flint rather than in Sage).
Oops, small fix for the output specification in the comment at line 85: it should read
return sum_{0 <= i < xhi, 0 <= j < yhi-ylo} s_{i, j+ylo} x^i y^j
Thanks! It looks straightforward to translate to C using nested gr_poly
for the bivariate polynomials. A generic mul_KS would be enough to make it quasilinear over all rings where we have quasilinear univariate multiplication.
gr_mpoly
ought to work as well, but that type is probably missing some utility functions necessary to make this convenient to implement right now. Eventually, it would make sense to have a dedicated type for dense bivariate polynomials.
Something to think about later is how to handle denominators optimally for compositions over Q.
Actually, since these bivariates are very regular, it makes sense to just represent them as 1D arrays, rearranging entries manually in this algorithm.
BTW, is ordinary Kronecker substitution the best way to multiply bivariates? Is there some way to reduce the zero padding?
On Friday, 19 April 2024 18:21:14 CEST Fredrik Johansson wrote:
BTW, is ordinary Kronecker substitution the best way to multiply bivariates? Is there some way to reduce the zero padding?
I suppose Harvey's trick from https://arxiv.org/abs/0712.4046 should apply.
-- Marc
Hello! I am one of the authors of this paper. Perhaps we could discuss how to implement it in FLINT together.
I believe there is an opportunity to further optimize the constant based on observations made in the paper by Bostan and Mori ("A simple and fast algorithm for computing the N-th term of a linearly recurrent sequence"). For example, the algorithm involves multiplying polynomials with Q(x) and Q(−x), and the FFT of Q(−x) can actually be directly obtained from the FFT of Q(x). I think these ideas could be combined with Harvey's trick.
Hi @EntropyIncreaser, and congratulations on this wonderful achievement!
I think it would make sense to start with a simple version for gr_poly
using the standard KS multiplication.
Unfortunately, we don't have a good generic FFT interface yet, though the question has come up before (#1459, #1418). It is possible to use the Schonhage-Strassen FFT (fft
) for big integer coefficients, fft_small
(or the future #2107) for small integer coefficients, and even acb_dft
for complex coefficients, but all of these will currently require their own interface code.
Recently a quasi-linear algorithm for composing power series, and also the related power projections, was announced: https://noshi91.hatenablog.com/entry/2024/03/16/224034
And there is now a more complete description on arXiv: https://arxiv.org/pdf/2404.05177.pdf
The algorithm seems rather easy to implement. At first reading, I do not see reasons to think that it will not be fast in practice. Well, at least in some contexts, since this may well depend on where the coefficients lie.
It would be nice to try this in Flint and see how it compares to the current
gr_poly_compose_series
.