Open 59428dd2-ef28-4cb1-9bba-5d31037b2661 opened 14 years ago
Attachment: trac_10268_enhance_simplify_rational.patch.gz
Attachment: test.ginsh.gz
Attachment: test.sage.gz
Okay, actually my patch does not work so well for the example that motivated it. I am now attaching a test.ginsh file that defines a rational expression, substitutes a variable with another big rational expression, calls normal and quits. The GiNaC shell finishes in about 1 minute on my laptop (in a shell execute: "time ginsh test.ginsh"). But after applying my patch to sage and then trying to do the equivalent thing via test.sage, it can go for hours without finishing. So, something is badly wrong, possibly my patch.
I'm not a Cython expert, but maybe you should only use the sig_on business inside the else? It would make sense to start off with method='normal' and then do Ginac, otherwise do Maxima; there is no need to create the GEx unless you are actually going to use it, I would think. I don't know if this would have anything to do with the hang, but it's worth a shot.
Replying to @kcrisman:
I am just learning sage, but it seems that the compiler does not like the GEx to be declared inside a conditional statement, which makes sense. The _sig_on and _sig_off thing I think is for catching segfaults, which doesn't seem to be a problem and when I comment those out, the behavior is the same. A slight possibility is the fact that when I use the GiNaC shell directly it is the most recent version, whereas Pynac forked off an older version, but the normal function has been in GiNaC for a long, long time.
More interesting is that when I interrupt sage, I get this traceback
KeyboardInterrupt Traceback (most recent call last)
/media/disk30/sage-4.6/<ipython console> in <module>()
/media/disk30/sage-4.6/local/lib/python2.6/site-packages/sage/symbolic/expression.so in sage.symbolic.expression.Expression.simplify_rational (sage/symbolic/expression.cpp:23989)()
/media/disk30/sage-4.6/local/lib/python2.6/site-packages/sage/symbolic/pynac.so in sage.symbolic.pynac.py_gcd (sage/symbolic/pynac.cpp:6440)()
/media/disk30/sage-4.6/local/lib/python2.6/site-packages/sage/rings/arith.pyc in gcd(a, b, **kwargs)
1363 sigma = Sigma()
1364
-> 1365 def gcd(a, b=None, **kwargs):
1366 r"""
1367 The greatest common divisor of a and b, or if a is a list and b is
/media/disk30/sage-4.6/local/lib/python2.6/site-packages/sage/interfaces/get_sigs.pyc in my_sigint(x, n)
7
8 def my_sigint(x, n):
----> 9 raise KeyboardInterrupt
10
11 def my_sigfpe(x, n):
KeyboardInterrupt:
Why would it fall into the gcd function from /media/disk30/sage-4.6/local/lib/python2.6/site-packages/sage/rings/arith.pyc? The patch does not call it directly, and it is a waste because normal in GiNaC already cancels the greatest common factor from the numerator and the denominator. And then a related question is why does gcd seem to hang?
Attachment: bench.sage.gz
I'm really happy to see some effort to use pynac/ginac to replace functionality we normally use maxima for. Unfortunately this is a really busy period for me so I can't help much. Thanks a lot for your effort Ben.
Replying to @sagetrac-bgoodri:
Why would it fall into the gcd function from /media/disk30/sage-4.6/local/lib/python2.6/site-packages/sage/rings/arith.pyc? The patch does not call it directly, and it is a waste because normal in GiNaC already cancels the greatest common factor from the numerator and the denominator. And then a related question is why does gcd seem to hang?
The numeric::gcd() method calls sage.symbolic.pynac.py_gcd(). See here:
http://pynac.sagemath.org/hg/file/b233d9dadcfa/ginac/numeric.cpp#l2526
It could be that our gcd() function doesn't work exactly like CLN's gcd() function, which is what ginac originally uses. This would affect the termination criteria used in the multivariate gcd code in ginac/pynac.
I haven't looked into the functionality in normal.cpp much, but one of William's goals was to make it call Singular (or the Factory library) to factor multivariate polynomials instead of the code in ginac. This library generally performs much better and it is actively being developed.
BTW, kcrisman was right about his comment on the use of sig_on/sig_off. These functions are used to override the signal handlers so we can catch CTRL-C in long running library code. You don't need them around the declaration of the GEx, but you should use them in the call to normal(). AFAIR, Jeroen Demeyer recently wrote a section on this in the developers guide, but I don't have a link handy right now. The new call signature for these might be sig_on(), like a function call.
Author: Ben Goodrich
Replying to @burcin: cc'ing William for clarification
I haven't looked into the functionality in normal.cpp much, but one of William's goals was to make it call Singular (or the Factory library) to factor multivariate polynomials instead of the code in ginac. This library generally performs much better and it is actively being developed.
Do you really want to do full factorization in simplify_rational()? I think neither Maxima nor GiNaC does that, only square-free factorization and gcd cancellation. We could add an option to do full factorization of the numerator and denominator before returning. If so, would it make more sense to first backport the functionality in factor.cpp from GiNaC 1.5.x to the pynac fork than to code a pynac-libSingular link?
Also, I think this might be a bit separate from the issue I was hitting. When I ran test.sage last night under trace() with the enter key wedged down, by the morning it had called gcd() over 30,000 times and hadn't even passed the rational expression to GiNaC yet. This is a waste because GiNaC's normal() function was going to do 1 gcd cancellation anyway. So, it seems what we need is an option to prevent sage from trying to find the gcd of every subexpression.
The original bench.sage was not very appropriate because sage was simplifying the rational expression to 1 before passing it to Maxima or GiNaC. So the difference in speed primarily reflected the difference between interacting via pexpect and interacting via a library. The revised bench.sage avoids this and there is about a 7x speedup. However, sage is repeatedly calling gcd() automatically, and the performance would probably jump if we could avoid that somehow.
Replying to @burcin:
Replying to @sagetrac-bgoodri:
And then a related question is why does gcd seem to hang?
The numeric::gcd() method calls sage.symbolic.pynac.py_gcd(). See here: http://pynac.sagemath.org/hg/file/b233d9dadcfa/ginac/numeric.cpp#l2526
It could be that our gcd() function doesn't work exactly like CLN's gcd() function, which is what ginac originally uses. This would affect the termination criteria used in the multivariate gcd code in ginac/pynac.
Okay, I've been contributing to the confusion. Now I see what you meant: When sage calls Pynac's normal() function, Pynac calls "its" gcd() function, which is actually sage's gcd() function. So, all the calls to gcd() are expected behavior, and the question becomes why doesn't Pynac get them over with and terminate in 30 seconds like GiNaC does with the CLN implementation of gcd()?
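To make the callback relationship concrete, here is a minimal pure-Python sketch. It is not Pynac or Sage code: `CountingGcd`, `add_normalized` and `harmonic_sum` are hypothetical names invented for this illustration, and plain integer fractions stand in for rational expressions. The point is only that a library which normalizes after every arithmetic step will call whatever gcd it was handed once per step, so the call count grows with the number of subexpressions and the speed of the injected gcd dominates the run time.

```python
from math import gcd as reference_gcd


class CountingGcd:
    """Wrap a gcd implementation and count how often it is called."""

    def __init__(self, impl):
        self.impl = impl
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.impl(a, b)


def add_normalized(x, y, gcd_impl):
    """Add two fractions given as (num, den) pairs, cancelling via gcd_impl."""
    num = x[0] * y[1] + y[0] * x[1]
    den = x[1] * y[1]
    g = gcd_impl(num, den)
    return (num // g, den // g)


def harmonic_sum(n, gcd_impl):
    """Compute 1/1 + 1/2 + ... + 1/n, normalizing after every addition."""
    total = (0, 1)
    for k in range(1, n + 1):
        total = add_normalized(total, (1, k), gcd_impl)
    return total


# The "library" (add_normalized) never calls math.gcd directly; it only
# calls the callback it was given, much like a gcd hook supplied by the host.
counter = CountingGcd(reference_gcd)
result = harmonic_sum(4, counter)
print(result, counter.calls)
```

Swapping `reference_gcd` for a slower or subtly different implementation changes the timing without touching `add_normalized`, which is the situation described above.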
For reference, this is what I get when I switch on the statistics bookkeeping in the (latest) GiNaC source:
goodrich@Y560:/tmp$ time ginsh test.ginsh
ginsh - GiNaC Interactive Shell (ginac V1.5.8)
__, _______ Copyright (C) 1999-2010 Johannes Gutenberg University Mainz,
(__) * | Germany. This is free software with ABSOLUTELY NO WARRANTY.
._) i N a C | You are welcome to redistribute it under certain conditions.
<-------------' For details type `warranty;'.
Type ?? for a list of help topics.
gcd() called 56331 times
sr_gcd() called 0 times
heur_gcd() called 1612 times
heur_gcd() failed 1 times
real 0m34.025s
user 0m33.245s
sys 0m0.602s
Okay, that is what is supposed to happen. With the current Pynac, I interrupt after an hour and get:
sage: quit;
Exiting Sage (CPU time 62m29.29s, Wall time 64m3.75s).
gcd() called 57537 times
sr_gcd() called 19 times
heur_gcd() called 1936 times
heur_gcd() failed 19 times
So, it looks as if Pynac is hanging at or toward the end, and it experiences many more failures in the heur_gcd() routine. I guess I should be looking at the gcd heuristics then. Any ideas come to mind?
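For readers unfamiliar with the heuristic, the idea behind a heur_gcd()-style routine can be sketched as follows for univariate integer polynomials. This is a simplified illustration of the classical GCDHEU approach, not Pynac's or GiNaC's actual code: the coefficient-list representation (lowest degree first), the helper names, the retry limit and the factor used to enlarge xi are all assumptions made for this example. A `None` result corresponds to one "failed" count in the statistics above, after which a real implementation would fall back to another method such as sr_gcd().

```python
from functools import reduce
from math import gcd


def content(p):
    """Non-negative gcd of all coefficients of p."""
    return reduce(gcd, p, 0)


def evaluate(p, x):
    """Evaluate p (coefficients, lowest degree first) at x by Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = acc * x + c
    return acc


def exact_quotient(a, d):
    """Return a / d if d divides a exactly over the integers, else None."""
    if len(d) > len(a):
        return None
    rem = list(a)
    quo = [0] * (len(a) - len(d) + 1)
    for i in range(len(quo) - 1, -1, -1):
        coeff, r = divmod(rem[i + len(d) - 1], d[-1])
        if r:
            return None
        quo[i] = coeff
        for j, dj in enumerate(d):
            rem[i + j] -= coeff * dj
    return quo if not any(rem) else None


def from_symmetric_digits(g, xi):
    """Read the integer g as a polynomial in xi with digits in (-xi/2, xi/2]."""
    digits = []
    while g:
        r = g % xi
        if r > xi // 2:
            r -= xi
        digits.append(r)
        g = (g - r) // xi
    return digits


def heur_gcd(a, b, max_tries=6):
    """Heuristic gcd of two non-zero integer polynomials, or None on failure."""
    ca, cb = content(a), content(b)
    a = [c // ca for c in a]
    b = [c // cb for c in b]
    xi = 2 * min(max(map(abs, a)), max(map(abs, b))) + 2
    for _ in range(max_tries):
        # One integer gcd replaces the polynomial gcd computation.
        cand = from_symmetric_digits(gcd(evaluate(a, xi), evaluate(b, xi)), xi)
        if cand:
            cg = content(cand)
            cand = [c // cg for c in cand]
            # Only accept the candidate if it really divides both inputs.
            if exact_quotient(a, cand) is not None and exact_quotient(b, cand) is not None:
                return [gcd(ca, cb) * c for c in cand]
        xi = xi * 73794 // 27011  # assumed growth factor; try a larger point
    return None


# gcd((x + 1)(x + 2), (x + 1)(x - 3)) = x + 1
print(heur_gcd([2, 3, 1], [-3, -2, 1]))
```

In this example the first evaluation point (xi = 8) produces a candidate that fails the division check, and the retry at a larger point succeeds. A single run of the routine can therefore involve several integer gcd() calls, and a wrong acceptance test or termination criterion shows up as extra failures rather than wrong answers.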
Bah, it was a bug in GiNaC that was fixed by this recent GiNaC commit:
http://www.ginac.de/ginac.git?p=ginac.git;a=commit;h=edf1ae46a926d0a718063c149b78c1b9a7ec2043
I can bring the bug back by reverting it.
However, the patch touches code that was only added to the 1.5.x branch of GiNaC, so we can't just apply that patch to the pynac spkg. I guess the logic that this patch fixes was also wrong somewhere in the 1.4.x branch of GiNaC. But I'm too tired and frustrated to look into it right now.
Replying to @sagetrac-bgoodri:
Replying to @burcin: cc'ing William for clarification
I haven't looked into the functionality in normal.cpp much, but one of William's goals was to make it call Singular (or the Factory library) to factor multivariate polynomials instead of the code in ginac. This library generally performs much better and it is actively being developed.
Do you really want to do full factorization in simplify_rational()? I think neither Maxima nor GiNaC does that, only square-free factorization and gcd cancellation. We could add an option to do full factorization of the numerator and denominator before returning. If so, would it make more sense to first backport the functionality in factor.cpp from GiNaC 1.5.x to the pynac fork than to code a pynac-libSingular link?
Sorry for the confusion. I meant to say gcd.
Now that there is a separate ticket for the bug in the pynac gcd, #10284, I will post my response to the other questions there.
Pynac's normal() was made accessible through Expression in #12068, so this patch can take advantage of it. The gcd error can optionally be worked around with Pynac-0.6.7 and libgiac, if it is installed (#20742). This speeds up the GCD by an order of magnitude versus Pynac and by two orders of magnitude versus pexpect-Maxima.
Currently simplify_rational() only offers 3 Maxima methods. GiNaC offers another possibility via its normal() method. This issue is discussed here
http://groups.google.com/group/sage-devel/browse_thread/thread/843c17dcbd9c2958
I have a patch and a benchmark but need to redownload sage because I am getting unrelated doctest failures with or without the patch.
EDIT: All tests pass now with the attached patch, as they should because the default behavior is not changed. Also, I am attaching a benchmark script using random rational expressions that simplify to 1. In this benchmark, the GiNaC option is about 10 times faster than the default option (Maxima's fullratsimp, without utilizing libraryness).
One limitation of this patch is that it does not support Maxima's map option. GiNaC has a map function, but utilizing it from sage would require a bit more effort.
CC: @williamstein @eviatarbach
Component: symbolics
Author: Ben Goodrich
Issue created by migration from https://trac.sagemath.org/ticket/10268