Open asmeurer opened 8 years ago
I think we should simply describe each submodule, what algorithm it is using and cite literature. If it is not published, then we can describe more how things work.
For example, for the limits submodule, it will be a simple citation to literature.
sympy.tensor.tensor
==> Schreier Sims algorithm, Butler-Portugal algorithm.sympy.diffgeom
==> Functional differential geometry, Jack Wisdom & Gerald Jay SussmanIn the integrals module, Risch Integration Algorithm
as well. -
The sympy.poys.ring_series
module implements quite a few semi-numeric algorithms that aren't easy to find in literature. Should they be included?
I had a look at ring_series.py and I think all the algorithms there are standard (many are not even the fastest ones). The basics on power series manipulation are covered in sections 4.6 and 4.7 of Knuth vol 2, in Modern Computer Algebra (von zur Gathen and Gerhard), and in Modern Computer Arithmetic (Brent and Zimmermann). Newton's method, fast composition and differential equations are also covered in Brent and Kung's 1978 paper.
With this issue as well as with https://github.com/sympy/sympy-paper/issues/5, we need to figure out the scope of the paper. The paper could easily become a laundry list of features and it could also easily become a laundry list of algorithms. Both are aplenty. Personally, I'd like for a significant chunk of the paper to go into describing those things that are unique to SymPy, like its architecture, extensibility, and how it differs from other systems.
Yes. Agree on that @asmeurer .
How about this-:
I'd personally prefer that we first at-least list all of the major features of SymPy, and then explain in brief the algorithms that are involved with those major features. We must also, ensure that we give at-least 1 example related to that algorithm, now we can either give 1 example dedicated to a single algorithm, or also, we can implement different algorithms that come under a common feature, and list them under a common example. That'd also work.
But, I guess for certain important algorithms like those mentioned above by @Upabjojr , as well as others, we can give a dedicated example to them.
I'd also prefer that towards the end of the paper, we also discuss about how it differs from other systems
, extensibility
like @asmeurer pointed out.
I guess we can start working on the Paper Structure , and adding details to the same.
@asmeurer I wanted to work on the wiki, and I also want to contribute for certain algorithms related to Integration for this paper, as well. How should I go about them? Is there any write access permission? How can I edit the wiki - Paper Structure , to add my name?
Anybody should be able to edit the wiki.
@asmeurer It's not giving me any Edit option on that link. Is there any permission system that needs to be granted? Same can be seen in the screenshot below -: @asmeurer
Also, If I want to contribute by adding some files related to those algorithms, then how should I go about them?
I guess we must also add How is SymPy different from other CAS
in the Project Structure before Why SymPy
as well. @asmeurer
I wish to write about Integration from Risch
primarily, and also I wish to write about Group Theory
.
Are we considering to write about Group Theory
in the paper? @asmeurer
I guess the default wiki setting is to not allow anyone to edit. Good to know. I've fixed it.
Does SymPy use Borwein's factorial algorithm? Calculate the exponent of each prime up to $n$ of $n!$ then multiply them in one go. $O(p(n)log(n)log(p(n)))$ multiplies vs $O(n)$ naive. In large multiplies and divides you can get asymptotic performance wins by passing around exponent lists of prime factors.
As a reader of the paper I would like to know what SymPy's resource usage is for large calculations. At minimum a description of what BigNum packages are used under the hood.
For floating point error breakdowns, Gustafson has a nice writeup of Kahan's tests. I think showing the three he featured would be instructive without boring the reader.
Kahan Test 1: $E(0)=1$, $E(z) = {e^{z}-1 \over z }$ $Q(x) = |x - \sqrt{x^{2}+1}| - {1\over x+ \sqrt{x^{2}+1}$ $H(x)= E(Q(x)^{2})$ Test with $x = 15.0, 16.0, 17.0, 9999.0$. Answer should be 1 for each.
Kahan Test 2: Minimize $log(|3(1-x)+1|)/80 + x^{2} +1$ in $0.8 \leq x \leq 2.0$
Kahan Test 3: Rump's royal pain. An order 8 two variable equation.
factorial uses the Prime-Swing (as noted in the docstring). I don't know if that's the same as Borwein's algorithm. For big integers, we use Python's builtin integers, although the polys and mpmath optionally use gmpy if it is installed (we should probably mention this more explicitly). @fredrik-johansson would have to comment on the floating point bits.
What algorithms should we go into, and in what detail? Note that there are actually a lot of nontrivial algorithms in SymPy due to the polys (http://docs.sympy.org/latest/modules/polys/literature.html) (and a ton more if we include what's in mpmath). My thinking is that we shouldn't spend too much time describing algorithms that are already described in the literature. If there are things in SymPy that aren't based on an algorithm from the literature, we should describe that in more detail, at least if it's significant.