Nemocas / Nemo.jl

Julia bindings for various mathematical libraries (including flint2)
http://nemocas.github.io/Nemo.jl/

Powering of power series over Z/nZ #319

Open wbhart opened 6 years ago

wbhart commented 6 years ago

Over Z/nZ it is possible to end up with greater precision when powering power series than when doing it over Z or some other integral domain.

This is because the binary exponentiation method, for example, may lead to intermediate results that have higher valuation than when working over an integral domain. For each additional zero coefficient, the absolute precision of the square increases as well.

Currently all the code in Nemo raises to a power by removing any valuation, treating the power series as absolute series with a fixed maximum precision (equal to the original relative precision), raising to the power as an absolute series (using binary exponentiation and polynomial mullow), then converting back to a relative power series and renormalizing.

This has the property that f^n = f.f.f...f (when comparing with isequal), but gives suboptimal performance and suboptimal precision.
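The scheme just described can be sketched in a few lines of plain Python (illustrative names only, not Nemo's actual code): a relative series is modeled as a triple (val, coeffs, relprec) over Z/mZ, and powering works on the unit part as an absolute series of fixed length relprec.

```python
def mullow(a, b, n, m):
    """Classical truncated product of coefficient lists a, b mod m, length n."""
    c = [0] * n
    for i, ai in enumerate(a):
        if ai == 0 or i >= n:
            continue
        for j, bj in enumerate(b):
            if i + j >= n:
                break
            c[i + j] = (c[i + j] + ai * bj) % m
    return c

def pow_rel_series(val, coeffs, relprec, e, m):
    """Raise x^val * (sum coeffs[i] x^i) + O(x^(val+relprec)) to the e-th
    power over Z/mZ the way the issue describes: binary exponentiation on
    an absolute series of fixed length relprec, then renormalize."""
    res, base = [1] + [0] * (relprec - 1), list(coeffs[:relprec])
    k = e
    while k:
        if k & 1:
            res = mullow(res, base, relprec, m)
        base = mullow(base, base, relprec, m)
        k >>= 1
    z = next((i for i, c in enumerate(res) if c), relprec)  # leading zeros
    return e * val + z, res[z:], relprec - z

# f = x + x^2 + O(x^4) over Z/3Z: val=1, unit part [1, 1, 0], relprec=3.
print(pow_rel_series(1, [1, 1, 0], 3, 3, 3))  # (3, [1, 0, 0], 3)
```

With exact integer arithmetic f^3 = x^3 + 3x^4 + 3x^5 + x^6, so over Z/3Z the true result is x^3 + x^6 with precision O(x^12); the fixed-precision cap reports only x^3 + O(x^6), illustrating the precision loss discussed here.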

We should make a decision as to whether to switch to using binary exponentiation with relative series as intermediate results, rather than first (notionally) converting to an absolute series.

If we decide to do this, then it will affect nmod_rel_series, fmpz_mod_rel_series and RelSeries.jl. It can be done without modifying Flint, which doesn't have relative series per se.

Note that the current test code for RelSeries.jl won't pass with isequal when comparing to f.f.f...f, but would still pass with == instead.

The advantage of switching would be better performance, higher precision and agreement with Sage for n != 0.

wbhart commented 6 years ago

A third option, suggested by Claus, is to use bivariate polynomials when working over a residue ring. This would guarantee the maximum possible precision.

E.g. f = a_0 + a_1x + ... + a_nx^n + O(x^{n+1}) would be represented as:

f~ = a_0 + a_1x + ... + a_nx^n + y*x^{n+1}

The precision of f^n is then obtained by finding the first term of (f~)^n that contains a y.

The only way I can see to do this without affecting the performance of power series over Z/pZ would be to do the computation the usual way first and detect if the first coefficient that is expected to be nonzero is in fact zero. If so, we have to run the computation above using Claus' technique.

Unfortunately, this will in practice be quite slow, and hard to support in Flint, due to the requirement for bivariate polys. But it is a theoretical option, in case there is some application for it.
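Claus' trick can be demonstrated with a toy bivariate computation (plain Python, illustrative only, not Flint/Nemo code): replace O(x^{n+1}) by y*x^{n+1}, raise to the power, and read the precision off the first surviving term containing y.

```python
def bimul(f, g, m):
    """Multiply bivariate polys, stored as dicts {(xdeg, ydeg): coeff}, mod m."""
    h = {}
    for (i, j), a in f.items():
        for (k, l), b in g.items():
            key = (i + k, j + l)
            h[key] = (h.get(key, 0) + a * b) % m
    return {k: v for k, v in h.items() if v}

def bipow(f, e, m):
    """Naive e-fold product; enough for a demonstration."""
    res = {(0, 0): 1}
    for _ in range(e):
        res = bimul(res, f, m)
    return res

# f = x + x^2 + O(x^4) over Z/3Z becomes f~ = x + x^2 + y*x^4.
ft = {(1, 0): 1, (2, 0): 1, (4, 1): 1}
cube = bipow(ft, 3, 3)
prec = min(i for (i, j) in cube if j > 0)              # first term with a y
exact = sorted(i for (i, j) in cube if j == 0 and i < prec)
print(exact, prec)  # [3, 6] 12
```

For f = x + x^2 + O(x^4) in char 3 this finds f^3 = x^3 + x^6 with precision O(x^12), far better than the O(x^6) the fixed-cap absolute scheme yields, but at the cost of carrying bivariate terms through the whole computation.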

fieker commented 3 years ago

1st: you can see this if you expand the power "symbolically" using the binomial/multinomial theorem: some multinomial coefficients are exactly zero in positive characteristic, so the error terms there do not matter. Hence (x + x^2 + O(x^4))^3 in char 3 increases precision.

2nd: I am inclined to say this has nothing to do with binary powering - except in char 2. In general, in char p you're better off doing base-p powering, as powering by the characteristic is free and exact. For the exponent digits < p you can use square-and-multiply (there the binomial coefficients can never be exactly zero), possibly supplemented with some caching of powers.
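The base-p strategy can be sketched as follows (a plain-Python model over F_p, not actual Flint/Nemo code): the Frobenius x -> x^p gives f^(p^k) exactly, multiplying the precision by p each time, and only the base-p digits of the exponent need genuine multiplications.

```python
EXACT = 10**9   # sentinel precision for the exact starting value 1

def series_val(c):
    return next((i for i, a in enumerate(c) if a), len(c))

def mullow(a, b, n, p):
    c = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j < n:
                    c[i + j] = (c[i + j] + ai * bj) % p
    return c

def mul_series(f, N, g, M, p):
    # product of f + O(x^N) and g + O(x^M): precision min(N + val g, M + val f)
    prec = min(N + series_val(g), M + series_val(f))
    return mullow(f, g, prec, p), prec

def frobenius(f, N, p):
    # over F_p: (f + O(x^N))^p = f(x^p) + O(x^(pN)), using a^p = a in F_p
    out = [0] * (N * p)
    for i, a in enumerate(f):
        out[i * p] = a
    return out, N * p

def pow_base_p(f, N, e, p):
    res, R = [1], EXACT
    while e:
        for _ in range(e % p):      # digit < p: plain multiplications
            res, R = mul_series(res, R, f, N, p)
        f, N = frobenius(f, N, p)   # next base-p digit: free Frobenius
        e //= p
    return res, R

# (x + x^2 + O(x^4))^3 over F_3: coefficients [0, 1, 1, 0], abs prec 4.
r, prec = pow_base_p([0, 1, 1, 0], 4, 3, 3)
print([i for i, c in enumerate(r) if c], prec)  # [3, 6] 12
```

This reproduces the char-3 example above: (x + x^2 + O(x^4))^3 comes out as x^3 + x^6 + O(x^12) rather than losing precision to a fixed cap.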

3rd: hypothetically, the same issue arises with addition: x + x + x is not 3*x.

fieker commented 3 years ago

Final point: in general, tracking precision in an optimal way is Xavier Caruso (et al.)'s hobby. It can be done and is mathematically interesting, but it is too slow in general.

To me one should have

wbhart commented 3 years ago

I'm happy to have optimal precision tracking as a different model, if we ever need it. No need to implement it if not.

fieker commented 3 years ago

On Mon, Mar 01, 2021 at 04:10:59AM -0800, wbhart wrote:

I'm happy to have optimal precision tracking as a different model, if we ever need it. No need to implement it if not.

I agree - and define

  • abs series as S[[t]]/t^n and not do anything fancy
  • rel series using the easy semantics and leave the rest.

However, I'd say that with respect to the implementation in AA/Generic there should be an array layer underpinning both polynomials and series, the same way this is handled in flint... Currently there is a vast amount of duplication in re-implementing the multiplication, as well as a total lack of fast products in the series code.

-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/Nemocas/Nemo.jl/issues/319#issuecomment-787901373

wbhart commented 3 years ago

I'm not sure what an array layer would do that Julia arrays are not already being used for. Scalar multiplication?

Algorithms for series generally differ from those for polynomials.

I also don't think there is anything wrong with our relative series implementation. I see no reason to change it.

fieker commented 3 years ago

On Mon, Mar 01, 2021 at 04:42:00AM -0800, wbhart wrote:

I'm not sure what an array layer would do that Julia arrays are not already being used for. Scalar multiplication?

Algorithms for series generally differ from those for polynomials.

I also don't think there is anything wrong with our relative series implementation. I see no reason to change it.

Our series multiplication only uses classical multiplication. For larger precision it should at least use a mullow Karatsuba. I'd say it's safe to assume that the generic code is not going to be used mainly with Z/pZ, but with expensive rings. Currently if I want to, say, support number fields I have to implement polys over number fields and series over number fields, the latter twice: rel and abs. However, the multiplication is mostly the same.
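For illustration, a mullow can be built on Karatsuba by truncating the inputs to length n, doing a full Karatsuba product, and keeping the first n coefficients (a hedged sketch, not Flint's or AbstractAlgebra's implementation; a real version would also prune recursive products that only affect discarded coefficients). Plain integers stand in for an expensive base ring.

```python
def zip_pad(u, v):
    """Pair up two coefficient lists, zero-padding the shorter one."""
    n = max(len(u), len(v))
    return list(zip(u + [0] * (n - len(u)), v + [0] * (n - len(v))))

def karatsuba(a, b):
    """Full product of coefficient lists a, b by Karatsuba recursion."""
    if not a or not b:
        return []
    if min(len(a), len(b)) <= 8:          # classical base case
        c = [0] * (len(a) + len(b) - 1)
        for i, x in enumerate(a):
            for j, y in enumerate(b):
                c[i + j] += x * y
        return c
    k = max(len(a), len(b)) // 2
    a0, a1 = a[:k], a[k:]                 # a = a0 + x^k * a1
    b0, b1 = b[:k], b[k:]
    p0 = karatsuba(a0, b0)
    p2 = karatsuba(a1, b1)
    p1 = karatsuba([x + y for x, y in zip_pad(a0, a1)],
                   [x + y for x, y in zip_pad(b0, b1)])
    res = [0] * (len(a) + len(b) - 1)     # p0 + x^k(p1-p0-p2) + x^2k p2
    for i, v in enumerate(p0):
        res[i] += v
        res[i + k] -= v
    for i, v in enumerate(p1):
        res[i + k] += v
    for i, v in enumerate(p2):
        res[i + k] -= v
        res[i + 2 * k] += v
    return res

def mullow_karatsuba(a, b, n):
    """First n coefficients of a*b via Karatsuba on the truncated inputs."""
    c = karatsuba(a[:n], b[:n])[:n]
    return c + [0] * (n - len(c))
```

Since coefficients of index >= n in the inputs cannot influence the first n output coefficients, truncating the inputs first is valid and already saves work for series shorter than their precision.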

Why is flint using a vector_... interface underlying poly and series?


wbhart commented 3 years ago

Using algorithms of higher complexity in generics is problematic because of increased jit time and due to cutoffs being different per ring. I don't have any easy answers here.

Flint uses a vector interface because there is no Julia array type in C. It really doesn't do anything important. It's mainly just scalar operations (multiplication, division).

thofma commented 3 years ago

I don't think we need universal cutoffs, but it would be good to have the algorithms with better asymptotic complexity. But we don't want to duplicate everything for polynomials and series, so we could do the same as flint/Nemo do.

wbhart commented 3 years ago

Of course we can do the same as flint does, but flint has very few series functions and they are separate from the polynomial functions (even though they are in the same directories).

Yes, we could have different cutoffs for different rings. Of course you know my opinion on whether polynomials and the like should be done in C or not. Then the jit compilation is not relevant.

Calcium will soon have series over number fields by the way.

wbhart commented 3 years ago

What I suspect you are suggesting is that the generic series be implemented in terms of polynomials. That's fine, but the jit will bite hard. Do as you wish, I guess.

thofma commented 3 years ago

No, no one is suggesting this. The suggestion is to have, for example, a mullow implementation working with arrays that can be used by both polynomials and series. This also has the advantage that it reduces jit time.
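A minimal sketch of that idea (illustrative names, not AbstractAlgebra's actual internals): one array-level mullow kernel, parameterised by the base ring's operations, with both the polynomial product and the truncated series product as thin wrappers.

```python
from operator import add as zadd, mul as zmul

def mullow(a, b, n, add, mul, zero):
    """First n coefficients of a*b, over any ring given by add/mul/zero."""
    c = [zero] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                c[i + j] = add(c[i + j], mul(ai, bj))
    return c

def poly_mul(a, b, add, mul, zero):
    """Polynomial product: take all len(a)+len(b)-1 coefficients."""
    return mullow(a, b, max(len(a) + len(b) - 1, 0), add, mul, zero)

def series_mul(a, b, prec, add, mul, zero):
    """Absolute series product to precision prec: same kernel, capped."""
    return mullow(a, b, prec, add, mul, zero)

# (1 + x)^2 over Z, as a polynomial and as a series with precision 2.
print(poly_mul([1, 1], [1, 1], zadd, zmul, 0))        # [1, 2, 1]
print(series_mul([1, 1], [1, 1], 2, zadd, zmul, 0))   # [1, 2]
```

Both wrappers then share one kernel per coefficient type, which is where the reduction in duplication and jit time would come from.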

fieker commented 3 years ago

On Mon, Mar 01, 2021 at 06:22:40AM -0800, wbhart wrote:

Of course we can do the same as flint does, but flint has very few series functions and they are separate from the polynomial functions (even though they are in the same directories).

But fmpz_abs_series is using fmpz_poly_mullow. I'd expect to see the same pattern in the generic code: series_mul -> mullow_pol. The "best" way I can see to do that, also avoiding creating the parents, is to have an implementation based on arrays and the base_ring, not on the parent. Currently the (non-trivial!) mul has to be implemented many times (poly, rel_series, abs_series; even powering is done separately).

Yes, we could have different cutoffs for different rings. Of course you know my opinion on whether polynomials and the like should be done in C or not. Then the jit compilation is not relevant.

But you will never write all polynomials in C, hence the generic code needs to be "fast" as well. Possibly with a generic function (to be implemented per ring) to set cutoffs.

I am using Poly{SeriesElem{qadic}} and I do not think that you'd automatically include this in C implementations. Also: Poly{nmod_series} and Poly{fq_nmod_series}, where nmod_series is used as gfp_series, are only generic.

The jit time is currently mostly not relevant. It is way too long, but from an end-user's point of view, the final version will have to be a total compile of the most important instances (say, documentation and tests) on install. The install will then take 15 min, but the jit is gone.

If you're feeling lucky, try Oscar.build() or Hecke.build()....

Calcium will soon have series over number fields by the way.

Good on you, but until


wbhart commented 3 years ago

Poly mul is completely different from series mul. Of course you could implement series mul in terms of poly mullow, but it isn't used anyway, so there's no real gain except combining absolute and relative series. I don't see how this decreases jit time unless you happen to be using both types of series. Otherwise it will increase it.

And this is one function in a massive module. I don't see this as a significant advantage. But if you see something I don't, go ahead.

I can see an advantage in replacing the Nemo series with a single implementation parameterised by the ring. But it will slow down the jit. Again, go ahead if you think the effort is worth it.

fieker commented 3 years ago

On Mon, Mar 01, 2021 at 07:08:05AM -0800, wbhart wrote:

Poly mul is completely different from series mul. Of course you could implement series mul in terms of poly mullow, but it isn't used anyway, so there's no real gain except combining absolute and relative series. I don't see how this decreases jit time unless you happen to be using both types of series. Otherwise it will increase it.

And this is one function in a massive module. I don't see this as a significant advantage. But if you see something I don't, go ahead.

It's not ONE function - at least it should not be.

Why would this be slower?
