Closed jodavies closed 4 months ago
An experiment (though I don't know the meaning of this probablity distribution and also fluctuation uncertainty):
-w3 count=10000 good=9543 bad=457
-w4 count=10000 good=9869 bad=131
-w5 count=10000 good=9750 bad=250
-w6 count=10000 good=9578 bad=422
-w7 count=10000 good=9761 bad=239
-w8 count=10000 good=9815 bad=185
-w9 count=10000 good=9876 bad=124
-w10 count=10000 good=9885 bad=115
-w11 count=10000 good=9920 bad=80
-w12 count=10000 good=9932 bad=68
-w13 count=10000 good=9968 bad=32
-w14 count=10000 good=9957 bad=43
-w15 count=10000 good=9965 bad=35
-w16 count=10000 good=9989 bad=11
A slightly reduced version (test.frm
):
#-
Off stats;
CF prf;
S ep,n1,m4,m1,m2,m3,n2,n3,n4;
#include test.txt
*repeat id prf(m1?,m2?)*prf(n1?,n2?) = prf(m1*n1,m2*n2); * (1) doesn't change
ModuleOption noparallel;
.sort
PolyRatFun prf;
*ModuleOption noparallel; * (2) workaround
.sort
P;
ModuleOption noparallel;
.sort
#if termsin(diff7)
#terminate
#endif
ModuleOption noparallel;
.end
Usage: ./test.sh -w3
(NOTE: because FORM doesn't handle Ctrl+C
in the normal way, the above shell script is hard to stop by Ctrl+C
. You need to use Ctrl+Z
and then kill the job.)
In this version, the input contains just terms with ordinary symbols and functions. The problem occurs at the use of PolyRatFun
. So a workaround for now would be unfortunately Don't use PolyRatFun with three or more threads.
Somehow #:threadbucketsize 1
decreases the error rate, but doesn't remove the problem completely.
In fact, tform 4.1 at the v4.1 tag seems not to have any problems. I was using a network-wide 4.1 binary which has the tag "my_buggy-244-g6b38064" -- I cannot find this hash in the commit history? (Edit: dated 20161103)
I will try to bisect.
EDIT: I hit some commits that won't compile, and I need to finish some other stuff. I found, from good v4.1 and bad v4.2.0,
if you take those as good, one can find
Presumably the issue started around there somewhere...
EDIT2: I attached the log if you want to replay. bisect.log
Well, I would say a795a9c315dc75d1f5ca0f380becba8e2259d774 is the first bad commit in the sense that other bad commits (which fail the test, so both f5a5e82eec1798c15e1d5b7c3ef2787b9719125e and e55a108b2388b4d328dfcc384da700c8e044bd85 are bad) are descendants.
Around the bad commit:
BAD * e55a108 [2015-06-29] Improved memory usage of polygcd bracketing: brackets are now only generated when required. @Ben Ruijl
BAD * 56714e0 [2015-06-26] Merge branch 'master' of github.com:vermaseren/form @vermaseren
|\
BAD | * f5a5e82 [2015-06-26] Merge branch 'master' of github.com:vermaseren/form @Ben Ruijl
| |\
GOOD | * | 03debb8 [2015-06-26] - Fixed the bounds check for the gcd: now check each variable individually - The shape is now reset when we have found a gcd that is too large - Only if extra variables can be bracketed out, the gcd is called recursively @Ben Ruijl
BAD * | | 4f6cb55 [2015-06-26] 4 bugfixes (already!) @vermaseren
| |/
|/|
BAD * | 1a1db23 [2015-06-26] Added extra polyratfun options to manual @vermaseren
BAD * | a795a9c [2015-06-26] Added PolyRatFun divergence and expand. +bugfixes @vermaseren
GOOD * | 495aed3 [2015-06-24] - Reverted the end condition of the dense gcd routine. Seemingly the final condition in the Algorithms for Computer Algebra book is not good enough. @Ben Ruijl
|/
GOOD * f766b04 [2015-06-23] - Remove content in last variable before doing sparse gcd calculation. This improves performance and fixes a bug where a smaller divider is returned as the gcd - Refined the criteria for identifying bad gcd suggestions @Ben Ruijl
bisect.log:
# bad: [7599180782adfa9ca63b9d9cf1de0fd4ff590f21] Put in the fix and undid another
# good: [480ce79429a0c92a7da6c694dfe82d0b4afe48d0] Preparing for release 4.1
git bisect start 'HEAD' 'v4.1-20131025'
# bad: [24f17729ab5e4e9450ec650b57bf3feeca6edafd] Quick fix for crash by replace_(x,0)
git bisect bad 24f17729ab5e4e9450ec650b57bf3feeca6edafd
# bad: [cf7e7ec0c78a16dd2833c57f67b1496f9f6ed96b] [doc] Primitive support for LaTeX2HTML
git bisect bad cf7e7ec0c78a16dd2833c57f67b1496f9f6ed96b
# good: [928e8c9c221aecf1a04a8dde6f8fb2935d6e0710] [parform] Fix a bug in disabling parallelization due to $-vars
git bisect good 928e8c9c221aecf1a04a8dde6f8fb2935d6e0710
# good: [495aed3c300a3912ea94a87c3e2542f4a5ae856f] - Reverted the end condition of the dense gcd routine. Seemingly the final cond
git bisect good 495aed3c300a3912ea94a87c3e2542f4a5ae856f
# bad: [d40483a3a3ea215a8b2b9eec0287f57a2540aadd] - Zippel's algorithm only works for monic gcds. Multiplying by the lcoeff should f
git bisect bad d40483a3a3ea215a8b2b9eec0287f57a2540aadd
# bad: [56714e0ca76b019bd1f2ad6d17c8b07569382846] Merge branch 'master' of github.com:vermaseren/form
git bisect bad 56714e0ca76b019bd1f2ad6d17c8b07569382846
# bad: [4f6cb5513c4e508f87e1e134b711f28f839e9c64] 4 bugfixes (already!)
git bisect bad 4f6cb5513c4e508f87e1e134b711f28f839e9c64
# bad: [1a1db23f6ba869081d307f3a522add5961bc99b4] Added extra polyratfun options to manual
git bisect bad 1a1db23f6ba869081d307f3a522add5961bc99b4
# bad: [a795a9c315dc75d1f5ca0f380becba8e2259d774] Added PolyRatFun divergence and expand. +bugfixes
git bisect bad a795a9c315dc75d1f5ca0f380becba8e2259d774
# first bad commit: [a795a9c315dc75d1f5ca0f380becba8e2259d774] Added PolyRatFun divergence and expand. +bugfixes
Hello,
I have not (yet?) found a particularly minimal example for this issue, but I will post what I have so far.
With the following code, I am finding that in some* runs, the final result (which should be 0) is
or
The problem occurs in both tform and tvorm, 4,1, 4.2.0 and the latest git, and tvorm inside valgrind (but valgrind does not report any errors).
: problems start at -w3. For 3,5 threads there are problems ~ 2.4%,1.4% of the time. -w3 seems to produce only* the first output above. All other cases produce both. For even numbers of threads the problem is less frequent, more like 0.13% of runs.
The code is as follows. I have attached the necessary includes. They need renaming to .h.
test-expanded-tiny.txt test-result-tiny.txt
I will attempt to find a more trivial example later...
Thanks, Josh.