Invalid read in Normalize

jodavies commented 6 years ago

Hello,

I have stripped the following out of a procedure which is doing series expansion of rational polynomials. The problems begin only after a certain value of DEPTH.

This example in fact does not crash for me, but gives the same valgrind errors as the "full" crashing example, which exits two modules later with Program terminating at prfexpand Line 96 -->.

#-
#: MaxTermSize 1M
#: WorkSpace 200M
Off Statistics;

#define DEPTH "12"

AutoDeclare Symbol n,m;
Symbol ep;
CFunction prfexpnum,prfexpden,prfexpdeno,prfexptmp;
Symbol o,i;

Local test10 =                                                                                           
       + prfexpdeno(m0,m1 + ep*m2 + ep^2*m3 + ep^3*m4 + ep^4*m5 + ep^5*m6)*ep^-1
      ;                                                                                                     
.sort

Identify,Once ep^o?*prfexpdeno(n?,m?) = ep^o * sum_(i,0,`DEPTH'-o, (-1*ep)^i*prfexptmp(m^i)*prfexpden(n^i) ) * prfexpden(n);
SplitArg ((ep)) prfexptmp;
Identify prfexptmp(n?,?a) = prfexpnum(n) + prfexptmp(?a);

.end

The valgrind spam:

==2040== Invalid read of size 4
==2040==    at 0x45CEE3: Normalize (normal.c:2481)
==2040==    by 0x4B7B20: Generator (proces.c:3064)
==2040==    by 0x4B9473: Generator (proces.c:3931)
==2040==    by 0x4B9009: Generator (proces.c:3763)
==2040==    by 0x4B9009: Generator (proces.c:3763)
==2040==    by 0x4B97AC: Generator (proces.c:4015)
==2040==    by 0x4B97AC: Generator (proces.c:4015)
==2040==    by 0x4B9473: Generator (proces.c:3931)
==2040==    by 0x4BF771: DoSumF1 (ratio.c:436)
==2040==    by 0x4B8D94: Generator (proces.c:3698)
==2040==    by 0x4B9009: Generator (proces.c:3763)
==2040==    by 0x4B9473: Generator (proces.c:3931)
==2040==  Address 0x5f3d454 is 1,012 bytes inside a block of size 1,728 free'd
==2040==    at 0x4C2A35C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==2040==    by 0x50189E: M_free (tools.c:2363)
==2040==    by 0x50277C: FromVarList (tools.c:2778)
==2040==    by 0x450AFF: AddFunction (names.c:1228)
==2040==    by 0x4E8D08: StartVariables (startup.c:1047)
==2040==    by 0x4EA862: main (startup.c:1549)
==2040==  Block was alloc'd at
==2040==    at 0x4C29110: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==2040==    by 0x50182F: Malloc1 (tools.c:2236)
==2040==    by 0x5026AB: FromVarList (tools.c:2773)
==2040==    by 0x450AFF: AddFunction (names.c:1228)
==2040==    by 0x4E8D08: StartVariables (startup.c:1047)
==2040==    by 0x4EA862: main (startup.c:1549)
==2040== 
==2040== Conditional jump or move depends on uninitialised value(s)
==2040==    at 0x45CEE7: Normalize (normal.c:2481)
==2040==    by 0x4B7B20: Generator (proces.c:3064)
==2040==    by 0x4B9473: Generator (proces.c:3931)
==2040==    by 0x4B9009: Generator (proces.c:3763)
==2040==    by 0x4B9009: Generator (proces.c:3763)
==2040==    by 0x4B97AC: Generator (proces.c:4015)
==2040==    by 0x4B97AC: Generator (proces.c:4015)
==2040==    by 0x4B9473: Generator (proces.c:3931)
==2040==    by 0x4BF771: DoSumF1 (ratio.c:436)
==2040==    by 0x4B8D94: Generator (proces.c:3698)
==2040==    by 0x4B9009: Generator (proces.c:3763)
==2040==    by 0x4B9473: Generator (proces.c:3931)
==2040== 
  1.78 sec out of 1.78 sec

Thanks, Josh.

vermaseren commented 6 years ago

Hi Josh,

Obviously you are running out of maxtermsize and the reason is rather simple. Your prfexptmp contains powers of ep up to 64, which you may not want. Placing a reasonable cutoff in the declaration of ep (like `DEPTH'+4) should help a lot. That it runs a bit out of maxtermssize occasionally is hard to prevent without putting lots of testing statements all over the place (and slow things down).

Cheers

Jos

On 16 Feb 2018, at 11:34, Josh Davies notifications@github.com wrote:

-

: MaxTermSize 1M

: WorkSpace 200M

Off Statistics;

define DEPTH "12"

AutoDeclare Symbol n,m; Symbol ep; CFunction prfexpnum,prfexpden,prfexpdeno,prfexptmp; Symbol o,i;

Local test10 =

prfexpdeno(m0,m1 + epm2 + ep^2m3 + ep^3m4 + ep^4m5 + ep^5m6)ep^-1 ;
.sort

Identify,Once ep^o?prfexpdeno(n?,m?) = ep^o sum_(i,0,`DEPTH'-o, (-1ep)^iprfexptmp(m^i)prfexpden(n^i) ) prfexpden(n); SplitArg ((ep)) prfexptmp; Identify prfexptmp(n?,?a) = prfexpnum(n) + prfexptmp(?a);

.end

tueda commented 6 years ago

This breakage of the MaxTermSize check was caused by these commits. Looks like some implicit constraint conditions were broken. Before them, FORM correctly terminated with an error message:

(1)Output should fit inside a single term. Increase MaxTermSize?
Called from EndSort
Called from InFunction

git bisect:

BAD  *   36ecf57 [2015-04-12] Merge remote-tracking branch 'origin/gcd' @Ben Ruijl
     |\
GOOD | * 0cb5d3d [2014-10-06] Fixed a typo @Ben Ruijl
SKIP | * 483d98f [2014-10-06] Fixed the normalization in the top-level modular_gcd algorithm @Ben Ruijl
GOOD | * 58bb42d [2014-10-02] - Fixed some code in modular_gcd that prevented the selection of new   primes. This could cause an infinite loop - Simplified the mgcd algorithm a bit (and made it more similar to the   literature) @Ben Ruijl
GOOD | * 726f55d [2014-10-02] - Fixed the normalization in the gcd algorithm - Fixed the check to see if the current solution is the gcd @Ben Ruijl
BAD  * | bd36b3a [2015-04-01] Saner `DATE_' and `NTHREADS_' after .clear @Takahiro Ueda
BAD  * | 727b1c5 [2015-04-01] New preprocessor variable PID_ @Takahiro Ueda
BAD  * | 4fed881 [2015-03-09] Added spectator.c @vermaseren
SKIP * | efa8e07 [2015-03-09] Added dictionaries. Inverse PolyRatFun. More patterns with vectors. @vermaseren
BAD  * | 47aa1bb [2015-03-06] Repaired a bug in the interplay between tform and polyfun. @vermaseren
BAD  * | 5275487 [2015-01-29] and some repair of not including polyratfuninv @vermaseren
SKIP * |   1624032 [2015-01-29] Merge branch 'master' of github.com:vermaseren/form @vermaseren
     |\ \
BAD  | * | c3b18bb [2015-01-27] [parform] Fix bugs in InParallel @Takahiro Ueda
BAD  | * | 928e8c9 [2015-01-27] [parform] Fix a bug in disabling parallelization due to $-vars @Takahiro Ueda
BAD  | * | 1c081d2 [2015-01-23] [parform] Fix handling of zero expressions in InParallel @Takahiro Ueda
SKIP * | | d51d2ac [2015-01-29] Bug fixes in antisymmetric patternmatching and replace_ with many functions @vermaseren
     |/ /
BAD  * | b7b7bcb [2014-12-19] Repaired bug that did not treat argument that is zero correctly. @vermaseren
BAD  * | 3888b95 [2014-12-10] [doc] Fix the missing factor 4 in the trace with gamma5 (#28) @Takahiro Ueda
BAD  * | 0f5a9ac [2014-12-03] Revert the spectator stuff for now (#27) @Takahiro Ueda
SKIP * | 2444eda [2014-12-03] Set saner boundaries on function/term sortbuffers @vermaseren
GOOD * | 9275950 [2014-10-30] Allow `mu' as dict string via \`mu\'. @vermaseren
GOOD * | f07116c [2014-10-24] New command in transform. Lots of bugfixes. @vermaseren
     |/
GOOD * 57c183f [2014-09-23] Fixed the maximum string length (there was no room for a \0). This should be either 21 to support a 64 bit WORD and a \0 or 41 to support a 128 bit WORD and a \0. @Ben Ruijl

jodavies commented 7 months ago

It is not obvious to me that the original maxtermsize crash is really checking this explicitly. It is a par==1 EndSort, with lPatch>0. So newout is set to AllocFileHandle here: https://github.com/vermaseren/form/blob/b1f9041b7bb6f5dd4a149c87e70922ad9914b291/sources/sort.c#L738

Then you enter in: https://github.com/vermaseren/form/blob/b1f9041b7bb6f5dd4a149c87e70922ad9914b291/sources/sort.c#L855 https://github.com/vermaseren/form/blob/b1f9041b7bb6f5dd4a149c87e70922ad9914b291/sources/sort.c#L893 https://github.com/vermaseren/form/blob/b1f9041b7bb6f5dd4a149c87e70922ad9914b291/sources/sort.c#L922 and finally https://github.com/vermaseren/form/blob/b1f9041b7bb6f5dd4a149c87e70922ad9914b291/sources/sort.c#L949 which starts the crash.

After the change to the buffer sizes, there is never an lPatch, so you never end up here.

jodavies commented 7 months ago

I think a MaxTermSize of 1M is actually sufficient for this example. And setting it to, say, 20M does not help.

In fact, this example runs fine if the final Identify is commented, and this line is only reducing the size of the terms after the SplitArg.

jodavies commented 7 months ago

test-new.frm.txt test-new.h.txt

Here is a trimmed down example. I isolated a single bad term from the previous example. It is a function with >8500 arguments. Form loads it OK and sorts it in the first module, but the id statement crashes. Transform prfexptmp addargs(2,last); works OK.

If you remove a few of the arguments from the input (by defining OKBUTBADOUTPUT), it runs without crashing, with no valgrind errors, but the printed result is wrong. It looks a lot like the internal term representation?

test10 =
    + prfexptmp(14,0,12,1,8,20,12,21,1,26,1,13,1,3,14,0,12,1,8,20,11,21,2,
   26,2,78,1,3,14,0,12,1,8,20,12,22,1,26,2)
   ;

In the crashing case from valgrind I get

==2678043== Invalid read of size 4
==2678043==    at 0x193B5A: Normalize (normal.c:2544)
==2678043==    by 0x1FA307: Generator (proces.c:3272)
==2678043==    by 0x1FBF64: Generator (proces.c:4216)
==2678043==    by 0x1FD562: Processor (proces.c:406)
==2678043==    by 0x153330: DoExecute (execute.c:858)
==2678043==    by 0x1849BC: ExecModule (module.c:291)
==2678043==    by 0x1F2177: PreProcessor (pre.c:1041)
==2678043==    by 0x230967: main (startup.c:1688)

Also Identify prfexptmp(?a) = nargs_(?a); gives the wrong result. That one can be seen more easily:

#-
CFunction f;
Local test = f(1
    #do i = 2,`NARGS'
        ,`i'
    #enddo
    );
.sort
Identify f(?a) = nargs_(?a);
Print +s;
.end

This fails for NARGS=8192. Presumably there is some internal limit being hit here.

jodavies commented 7 months ago

Both the original code and the wrong nargs value are resolved (for this size of input) by making EATTENSOR larger in ftypes.h. I suppose there is a missing check and terminate in CheckWild that could catch this gracefully.

tueda commented 7 months ago

For the wrong nargs example, FORM works as follows: in the ArgAll block of MatchFunction():

https://github.com/vermaseren/form/blob/9637b9a8ba584fa9b10fef38b0939972e13357f6/sources/function.c#L1208-L1214

jumped from

https://github.com/vermaseren/form/blob/9637b9a8ba584fa9b10fef38b0939972e13357f6/sources/function.c#L1243

CheckWild(m[1],ARGTOARG,i=NARGS,t) returns "nihil obstat":

https://github.com/vermaseren/form/blob/9637b9a8ba584fa9b10fef38b0939972e13357f6/sources/wildcard.c#L1865-L1871

and in the next AddWild(m[1],ARGTOARG,i=NARGS):

https://github.com/vermaseren/form/blob/9637b9a8ba584fa9b10fef38b0939972e13357f6/sources/wildcard.c#L1563-L1567

the upper bits of the result are discarded by EATTENSOR = 0x2000 = 8192 (n is the result of nargs_).

AT.WildArgTaken is set in MatchFunction(): https://github.com/vermaseren/form/blob/9637b9a8ba584fa9b10fef38b0939972e13357f6/sources/function.c#L1160-L1165 so maybe we can raise an error when these values are >= EATTENSOR.

In practice, what could be a problem if we increase EATTENSOR to, say, 0x4000?

jodavies commented 7 months ago

I suppose both changes are needed, since one can always construct an example which hits the new limit. EATTENSOR would be better as a setup parameter, actually. Then FORM can crash with an error like "too many arguments in argument field wildcard" and the user can increase the maximum?

tueda commented 7 months ago

Now I think EATTENSOR is just a bit flag, which is passed as newnumber | EATTENSOR to CheckWild() or AddWild() only when type = ARGTOARG. If this is correct, then we can remove EATTENSOR by adding an additional argument for it to CheckWild()/AddWild() or passing the flag with type, like ARGTOARG | EATTENSOR. The latter may have a similar performance with the current implementation in the sense of the number of arguments passed to the functions (via the stack).

For reference, here is the ripgrep result for EATTENSOR with 9637b9a8ba584fa9b10fef38b0939972e13357f6:

function.c
1055:                           i = tobeeaten | EATTENSOR;

symmetr.c
802:                                            if ( CheckWild(BHEAD *p,ARGTOARG,funnycount|EATTENSOR,t) ) break;
803:                                            AddWild(BHEAD *p,ARGTOARG,funnycount|EATTENSOR);
960:                                            if ( CheckWild(BHEAD *p,ARGTOARG,wc|EATTENSOR,t) ) break;
961:                                            AddWild(BHEAD *p,ARGTOARG,wc|EATTENSOR);

wildcard.c
1563:                           if ( ( newnumber & EATTENSOR ) != 0 ) {
1564:                                   n = newnumber & ~EATTENSOR;
1758:           or-ed with EATTENSOR which is at least 8192.
1870:                                   if ( ( newnumber & EATTENSOR ) != 0 ) {
1871:                                           n = newnumber & ~EATTENSOR;

ftypes.h
782:#define EATTENSOR 0x2000

vermaseren / form

Invalid read in Normalize #268

-

: MaxTermSize 1M

: WorkSpace 200M

define DEPTH "12"