vermaseren / form

The FORM project for symbolic manipulation of very big expressions
GNU General Public License v3.0
1.15k stars 136 forks source link

gcd_ gives wrong results #258

Closed spj101 closed 6 years ago

spj101 commented 6 years ago

It appears that when the gcd_ function is called on certain multi-variate expressions containing terms with rational coefficients the wrong result is given. The error depends on the order in which the symbols appearing in the expressions are defined.

Tested with form and tform on 26793e439166fe71722f7a59e3c2260801596ee8 with macOS 10.13.2 and Linux (openSUSE Tumbleweed 20180116).

Example:

#-
Off Statistics;

*Symbols m,s,t; * works
Symbols s,t,m; * fails

L test1= 1/5*s + 1/5*(s+t)*m;
L test2= (s+t)*m;
L result = gcd_(test1,test2); * expect result = 1;

print;
.end

Output:

FORM 4.2.0 (Dec 15 2017, v4.2.0-29-g26793e4) 64-bits  Run: Thu Jan 18 16:50:33 2018
    #-

   test1 =
      1/5*t*m + 1/5*s + 1/5*s*m;

   test2 =
      t*m + s*m;

   result =
      m;

  0.00 sec out of 0.00 sec
jodavies commented 6 years ago

The "works" ordering gives

==11536== Conditional jump or move depends on uninitialised value(s)
==11536==    at 0x4C1A48: GCDterms (ratio.c:2094)
==11536==    by 0x4C211E: GCDfunction3 (ratio.c:1179)
==11536==    by 0x4C3530: GCDfunction (ratio.c:1059)
==11536==    by 0x4B8F31: Generator (proces.c:3733)
==11536==    by 0x4B9473: Generator (proces.c:3931)
==11536==    by 0x4BA9F3: Processor (proces.c:404)
==11536==    by 0x438717: DoExecute (execute.c:838)
==11536==    by 0x44EF96: ExecModule (module.c:274)
==11536==    by 0x4B0A12: PreProcessor (pre.c:962)
==11536==    by 0x4EA969: main (startup.c:1605)
==11536== 
==11536== Conditional jump or move depends on uninitialised value(s)
==11536==    at 0x4C1A26: GCDterms (ratio.c:2095)
==11536==    by 0x4C211E: GCDfunction3 (ratio.c:1179)
==11536==    by 0x4C3530: GCDfunction (ratio.c:1059)
==11536==    by 0x4B8F31: Generator (proces.c:3733)
==11536==    by 0x4B9473: Generator (proces.c:3931)
==11536==    by 0x4BA9F3: Processor (proces.c:404)
==11536==    by 0x438717: DoExecute (execute.c:838)
==11536==    by 0x44EF96: ExecModule (module.c:274)
==11536==    by 0x4B0A12: PreProcessor (pre.c:962)
==11536==    by 0x4EA969: main (startup.c:1605)

whereas the "fails" ordering gives nothing, for me.

benruijl commented 6 years ago

Some more information: when using the "working order", tt2 is pointing to an entry in an uninitialized buffer, which causes the error. In the "working" order the value *tt2 is 0, and for the wrong order it is 22.

jodavies commented 6 years ago

I tested, but didn't find any cases, in which these issues (this and #260 ) mess up polyratfun. Can you confirm whether or not I should expect any changes in expressions making use of polyratfun?

benruijl commented 6 years ago

I'm afraid polyratfun is affected as well, at least for Bug #260. Bug #260 only happened when there was a univariate gcd in the first variable, which is rare.

This bug could occur when the numerator of the integer content is a small number, such that it matches the internal ID for symbols, vectors, etc. It is strange we didn't see this one before. You can test if this bug affects the polyratfun by putting a print in GCDterms. If it gets called during simplification of the polyratfun, you could be in trouble.

tueda commented 6 years ago

I think GCDterms(), which was fixed for this issue, is not called from PolyRatFun, right?

spj101 commented 6 years ago

Thanks for the fast response @benruijl and @tueda.

On 8b945c158973a41596ab02be81f9d672ea39f097 this bug appears to be fixed. We also re-ran the original calculation that exposed this bug and we are now getting reasonable results.