vermaseren / form

The FORM project for symbolic manipulation of very big expressions
GNU General Public License v3.0
985 stars 118 forks source link

Fix Issue 468 #517

Closed jodavies closed 1 month ago

jodavies commented 1 month ago

These two commits definitely need some more eyes. In particular, there are two more PutToMaster calls in EndSort that I did not touch yet, probably they need the sLevel check as well.

Please check the commit messages for details.

coveralls commented 1 month ago

Coverage Status

coverage: 48.763% (+0.004%) from 48.759% when pulling 9e40e24f178a4e69361e144c3deec4b2dd9310d6 on jodavies:issue-468 into e58ebd3f344ad3a6e3ae5393be3f20d17a79192f on vermaseren:master.

jodavies commented 1 month ago

I have a better reproducing case for #468 which produces the problem 100% of the time, which I got by printing the polys being added using gdb, in the frame of poly_ratfun_add, which cause the issue. The test file is 400KB, but it runs quickly (well under a second): @tueda , how much do you care about the actual size of fixes.frm ?

In principle one can try to construct a problematic poly add with artificially small buffer sizes, but this is never that easy to do...

tueda commented 1 month ago

I think a large test file is acceptable, but maybe someone's editor doesn't like it. I haven't tested but putting the data in a separate file and using #include Issue468.dat (#appendpath . may be needed?) would work. Or, nested folds (surrounding the lines for the big expression by a fold) would work for some editors.

jodavies commented 1 month ago

OK, I assumed that #include statement would work on the CI tests, since make check works for me...

tueda commented 1 month ago

We need to add Issue468.h to EXTRA_DIST in check/Makefile.am.

jodavies commented 1 month ago

I think I am happy with this now.

The PutToMaster calls in EndSort are already protected by the S!=AT.S0 check here (S==AT.S0 falls to the following else, which contains the calls) https://github.com/vermaseren/form/blob/e58ebd3f344ad3a6e3ae5393be3f20d17a79192f/sources/sort.c#L756

I updated the checks in MergePatches to use S==AT.S0 rather than S->sLevel for consistency.

I constructed a more compact test, so the large file in include is gone again.

I would certainly keep the first two commits separate (they fix different bugs). The test can be merged in with the second or not, as you prefer.

tueda commented 1 month ago

OK, I will merge this PR as it is. Thanks!

tueda commented 1 month ago

-w2 (i.e., w/o sortbots) still seems be broken:

./check/check.rb build/sources/tvorm -w2 -v 'Issue468'
Output term too large. Try to increase MaxTermSize in the setup.
Called from InFunction
Program terminating in thread 2 at 1.frm Line 28 -->

Increasing MaxTermSize to 244K gives a wrong result:

   test =
      prf( - 2*a^30 - 60*a^29*b - 60*a^29*c - 60*a^29*d - 870*a^28*b^2 - 1740*
      ...
      *c^4*d^26 - 8120*c^3*d^27 - 870*c^2*d^28 - 60*c*d^29 - 2*d^30 + 2,1);
jodavies commented 1 month ago

That’s weird, I thought I had tested w2 properly also. I’ll look into it.

It might be an idea to have some w2 tests in the GitHub actions set also, since indeed the sorting is different wrt the sortbots.

jodavies commented 1 month ago

The problem here is that MasterMerge (which is called only in tform when the sortbots are not active) calls CompareTerms here: https://github.com/vermaseren/form/blob/74735f57a7351e5c790648f2332d06c164d701d0/sources/threads.c#L3742

CompareTerms sets S->PolyWise to 1 (as it should), but S->PolyWise is 0 when we return to MasterMerge. Thus poly_ratfun_add is not called here: https://github.com/vermaseren/form/blob/74735f57a7351e5c790648f2332d06c164d701d0/sources/threads.c#L3764

In MasterMerge, S is AT0.SS (which is B0->T.SS), but at the start of CompareTerms it is set to AT.SS (which is B->T.SS).

There is a relevant comment: https://github.com/vermaseren/form/blob/74735f57a7351e5c790648f2332d06c164d701d0/sources/threads.c#L3739-L3741 and indeed if I put B0 back there, this example works with tform -w2...

To use the BHEAD macro there (for consistency with the rest of the code, I suppose) we need to not set *B=0 at the top of MasterMerge, but rather set it to B0? Edit: no, B is used to control the locks. We just have to put B0 as the argument, as it already is for AddArgs, AddRat etc. MergeWithFloat having argument BHEAD looks wrong too.

It is a bit surprising that this bug has not come up before. It is nothing to do with the size of the polynomials (as Issue #468 has been). The problem occurs for the test with N=1 already. I suppose tform -w2 is likely not commonly used, but still...

jodavies commented 1 month ago

OK, the bug was introduced in the FORM5 merge last year. Before that, Compare1 did not take a struct argument but called GETIDENTITY. Then everything was OK. When the struct argument was (re-)added, the wrong struct was passed.