Open fingolfin opened 6 years ago
There is one thing we need to decide (and possibly change), and then there is some cleanup of the code (which I can do)
Generically, Gcd (or GcdOp) for two polynomials constructs the DefaultRing and then redispatches to the three-argument version. This ring construction can be costly and repetitive when working with polynomials, and in cases such as polynomials the ring does not actually help. (Let's leave the question of Z[x_1,x_2] out for the moment, as this is really unsupported and I am not eager to delve into it.)
Thus, for performance reasons, there are special methods for polynomial gcd; they are somewhat ad hoc and need cleanup. (That's why the methods appear for different numbers of arguments.)
So what needs to be done here is either:
Neither seems a great idea to squeeze into 4.9.
=====
The second path you observe is actually a path for rational univariate polynomials -- according to the literature, it is better to do some of the work on the coefficients modulo primes. I have never run timings to check whether there is a break-even point below which the generic method is better. If someone is interested in doing so, I'm happy to help with it and to make subsequent changes to the code.
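To illustrate the modulo-primes idea, here is a minimal standalone Python sketch (my own, not the GAP code; the function names `rem_mod` and `gcd_mod` are made up for this illustration): reducing integer polynomials modulo a prime p that does not divide the relevant leading coefficients keeps every intermediate coefficient below p, at the cost of possibly having to try several primes and lift the result back.

```python
def rem_mod(a, b, p):
    """Remainder of a modulo b over GF(p); coefficient lists, highest
    degree first.  The leading coefficient of b must be a unit mod p."""
    a = [c % p for c in a]
    binv = pow(b[0] % p, -1, p)          # inverse of the leading coefficient
    while len(a) >= len(b) and any(a):
        if a[0] == 0:
            a.pop(0)
            continue
        q = a[0] * binv % p
        for i in range(len(b)):
            a[i] = (a[i] - q * b[i]) % p
        a.pop(0)                         # leading coefficient is now zero
    while len(a) > 1 and a[0] == 0:      # strip leading zeros
        a.pop(0)
    return a or [0]

def gcd_mod(f, g, p):
    """Monic gcd of f and g over GF(p) via the Euclidean algorithm."""
    f = [c % p for c in f]
    g = [c % p for c in g]
    while any(g):
        f, g = g, rem_mod(f, g, p)
    inv = pow(f[0], -1, p)               # normalise to a monic result
    return [c * inv % p for c in f]

# (x+1)(x+2) and (x+1)(x+3): the common factor x+1 survives mod 7
print(gcd_mod([1, 3, 2], [1, 4, 3], 7))   # [1, 1], i.e. x + 1
```

A full modular gcd must still deal with "unlucky" primes and with lifting the modular result back to Z, which this sketch omits.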
First off, I never intended this for 4.9; I fully agree it wouldn't be a good idea to squeeze any of this in there -- we already released 4.9.0, and at this point only bug fixes or really, really important changes should be applied to stable-4.9. So this discussion is really for GAP 4.10.
Your explanation of how Gcd is optimized in the 2-arg case makes perfect sense, thank you. I think a comment or two in the code pointing out that this is an optimization would already help a lot with my concerns. Perhaps another comment near the method for rational univariate polynomials (something like: "this method is only for rational polynomials; it tries to avoid coefficient problems; for other coefficient rings, see function X / the code in file Y"). So I really don't expect a major refactoring here, just some clarification (ideally recorded in the form of source code comments for posterity).
As for the two options you list, I think I'd prefer the second one (if I understand it correctly), i.e. just install the 2-arg methods for GcdOp more systematically, so that in the example I give, the same Gcd code ends up being called (I don't have an opinion on which one).
As to the rational univariate polynomial Gcd algorithm: OK, I see how coefficient explosion etc. might be a problem, and that working mod primes and similar tricks could thus be useful. It would still be quite interesting to test this empirically. I am of course not expecting you to do that now; but perhaps "somebody" will turn up who is interested in doing it :-), or we can hire a student, or something...
[ Off-topic: To be honest, if I were to write code doing heavy polynomial computations, I personally would rather work on an interface to a system specialized for that (e.g. pick up work on SingularInterface, help @markuspf with PariInterface, or @sebasguts with JuliaInterface), as these already contain state-of-the-art implementations of these and many more algorithms, and our polynomial data type really isn't suitable for many large-scale computations to start with... ]
Anyway: if the existing code works, then I am not going to argue for removal. But I stumbled over this because I was looking at code coverage reports, and at functions which never get called by our test suite. One of them is RPIGcd. So I wonder if you know of any examples that actually exercise this code? Of course I could write tests which call it directly, but that's somewhat beside the point. That said, I also didn't really try hard to find examples, so perhaps it's just a matter of generating enough random polynomials and computing their gcds until I find some where the heuristics fail consistently.
@fingolfin As for cleaning up the no-ring gcd calls, I've made PR #2178 that I hope will deal with it.
Yes, it would be worthwhile for someone to do performance tests. Is there a good place to note that down?
As for the comparison with Singular and friends, I agree that we cannot hope to play in the same league. At the same time, I think the situation is similar to that with integers: while we do not aim to do big-league integer factorization, we need some reasonable functionality in the system, as a basic version is used in so many places.
Similarly, factorization of rational polynomials (which is where RPIGcd gets used) is needed in places such as computing element orders of matrices. It should come into play if a polynomial has a Galois group of larger permutation degree than necessary, e.g. (defining x as a rational indeterminate first):
gap> x:=Indeterminate(Rationals,"x");;
gap> f:=x^8-2;
gap> g:=GaloisSetResolvent(f,3);
gap> Factors(g);
I discovered by chance that there are two completely different code paths for GcdOp on univariate polynomials. One is triggered if you just give two polynomials; the other if you also specify the polynomial ring. I.e. if you run this example, two different GcdOp methods are executed.
The first method is in lib/ratfunul.gi, and calls the global function GCD_UPOLY, which in turn uses GcdCoeffs, which in turn is an alias for GCD_COEFFS, which is defined in lib/ratfun1.gi, and essentially uses the Euclidean algorithm.
The second method is in lib/polyrat.gi, line 722ff. It first tries a "heuristic method" by calling HeuGcdIntPolsCoeffs, and if that fails, calls RPIGcd, which seems to perform some kind of Hensel lifting.
I am now wondering whether there is a deeper reason (i.e. other than history) for having these two different code paths? If so, it would be helpful to at least add comments to both places, each mentioning the other. If not, perhaps we should (not today, but someday) work on unifying this into a single function?
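For reference, the general shape of such a heuristic gcd (in the style of the Char-Geddes-Gonnet GCDHEU heuristic; this is an illustrative Python sketch, not the actual code in HeuGcdIntPolsCoeffs, and all function names here are made up): evaluate both polynomials at a single large integer, take an ordinary integer gcd of the two values, read candidate coefficients back off as balanced digits, and verify the candidate by trial division.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def eval_at(coeffs, x):
    """Horner evaluation; coefficient lists, highest degree first."""
    v = 0
    for c in coeffs:
        v = v * x + c
    return v

def balanced_digits(n, base):
    """Balanced base-`base` digits of n >= 0, lowest order first."""
    digits = []
    while n:
        d = n % base
        if d > base // 2:
            d -= base                    # pick the representative in (-base/2, base/2]
        digits.append(d)
        n = (n - d) // base
    return digits or [0]

def divides(d, a):
    """Does polynomial d divide polynomial a over Q?"""
    a = [Fraction(c) for c in a]
    while len(a) >= len(d):
        q = a[0] / Fraction(d[0])
        for i in range(len(d)):
            a[i] -= q * d[i]
        a.pop(0)
    return not any(a)                    # zero remainder means divisibility

def heu_gcd(f, g):
    """One round of a heuristic integer-polynomial gcd; returns a primitive
    candidate, or None if verification by trial division fails."""
    xi = 2 * max(map(abs, f + g)) + 29       # heuristically large evaluation point
    h = gcd(eval_at(f, xi), eval_at(g, xi))  # integer gcd of the two values
    cand = balanced_digits(h, xi)[::-1]      # candidate gcd, highest degree first
    cont = reduce(gcd, map(abs, cand))       # strip the integer content
    cand = [c // cont for c in cand]
    if divides(cand, f) and divides(cand, g):
        return cand
    return None                              # a real implementation would retry

# (x+1)^2 (x+2) and (x+1)(x+3): expect x + 1
print(heu_gcd([1, 4, 5, 2], [1, 4, 3]))   # [1, 1]
```

The heuristic can fail for an unlucky evaluation point, which is why the trial-division check is essential and why the GAP code falls back to RPIGcd when it does.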
I am hoping that @hulpke (who wrote most if not all of that code) can shed some light on this!