vermaseren / form

The FORM project for symbolic manipulation of very big expressions
GNU General Public License v3.0
1.13k stars 135 forks source link

Crash in transform mulargs #230

Closed jodavies closed 3 months ago

jodavies commented 6 years ago

Hello,

I am trying to use Transform f mulargs(1,last) to efficiently invert a FactArg statement. I find either a crash or segfault, depending on some buffer sizes. The issue is triggered by

#: MaxTermSize 1M
#: WorkSpace 400M

AutoDeclare Symbol x;
CFunction f;

Local test = f(x1*...*x3502);
.sort

FactArg f;
Transform f mulargs(1,last);
Print +s;
.end

To see the crash in vorm (all tests with 4.2.0) the number of arguments must be increased to 3504.

Under valgrind, for me vorm does NOT crash, but yields f(0) and gives

==30276== Warning: set address range perms: large range [0x1077a040, 0x2f2065c8) (undefined)
==30276== Conditional jump or move depends on uninitialised value(s)
==30276==    at 0x4B7B2F: Generator (proces.c:3072)
==30276==    by 0x4B9473: Generator (proces.c:3931)
==30276==    by 0x4BD86F: InFunction (proces.c:2046)
==30276==    by 0x4B8DED: Generator (proces.c:3759)
==30276==    by 0x4B9473: Generator (proces.c:3931)
==30276==    by 0x4BA9F3: Processor (proces.c:404)
==30276==    by 0x438717: DoExecute (execute.c:838)
==30276==    by 0x44EF96: ExecModule (module.c:274)
==30276==    by 0x4B0A12: PreProcessor (pre.c:962)
==30276==    by 0x4EA969: main (startup.c:1605)

Curiously, if I modify the definition of the test expression to, say,

CFunction g;
Local test = f(<g(x1)>*...*<g(x8750)>);

The code runs in form, but not in vorm...

Thanks, Josh.

EDIT: The SplitArgs/Transform f addargs(1,last) equivalent of this script appears to work without issues.

jodavies commented 6 years ago

Ah, I now see #183 which seems very similar. I do not see the same error, however.

In this example, one cannot use the Repeat Identify f(x1?,x2?,?x3) = f(x1*x2,?x3) construct without the workspace being very large.

tueda commented 6 years ago

Does factarg (without transform mulargs) already have some problem?

#:maxtermsize 1M
#:workspace 400M
Auto S x;
CF f;
L F = f(x1*...*x3504);
factarg f;
P;
.end
==31479== Conditional jump or move depends on uninitialised value(s)
==31479==    at 0x4BBC38: TestSub (proces.c:1612)
==31479==    by 0x4B6B81: Generator (proces.c:3052)
==31479==    by 0x4B811A: Generator (proces.c:3763)
==31479==    by 0x4B8535: Generator (proces.c:3931)
==31479==    by 0x4B9CB5: Processor (proces.c:404)
==31479==    by 0x437702: DoExecute (execute.c:838)
==31479==    by 0x44DB7F: ExecModule (module.c:274)
==31479==    by 0x4AFBD6: PreProcessor (pre.c:962)
==31479==    by 0x4E87E0: main (startup.c:1605)
==31479==
=== Workspace overflow. 400000000 bytes is not enough.

Increasing Workspace doesn't help.

jodavies commented 6 years ago

You are right, it does.

I first hit such a crash after I replaced in my code a working Repeat Identify f(x1?,x2?,?x3) = f(x1*x2,?x3) for a Transform f mulargs(1,last); however, in an effort to make the code faster.

I don't seem to be able to produce a minimal example where the Transform is causing the crash, and not the FactArg alone.

tueda commented 6 years ago

Indeed, even if I don't use factarg and transform mulargs, I get a similar error, though this may be another bug:

#:maxtermsize 1M
#:workspace 400M
Auto S x;
CF f;
L F = f(x1,...,x3504);
#do n={10,2}
  repeat id f(x1?,...,x`n'?,?a) = f(x1*...*x`n',?a);  * unrolling to save workspace
#enddo
P;
.end
==31959== Conditional jump or move depends on uninitialised value(s)
==31959==    at 0x4BBC38: TestSub (proces.c:1612)
==31959==    by 0x4B6B81: Generator (proces.c:3052)
==31959==    by 0x4B811A: Generator (proces.c:3763)
==31959==    by 0x4B8535: Generator (proces.c:3931)
==31959==    by 0x4B811A: Generator (proces.c:3763)
==31959==    by 0x4B8535: Generator (proces.c:3931)
==31959==    by 0x4B811A: Generator (proces.c:3763)
==31959==    by 0x4B8535: Generator (proces.c:3931)
==31959==    by 0x4B811A: Generator (proces.c:3763)
==31959==    by 0x4B8535: Generator (proces.c:3931)
==31959==    by 0x4B811A: Generator (proces.c:3763)
==31959==    by 0x4B8535: Generator (proces.c:3931)
==31959==
=== Workspace overflow. 400000000 bytes is not enough.
=== Change parameter WorkSpace in form.set

The invalid conditional jump is before the workspace overflow. Increasing the workspace to 800M gives F = f(0).

vermaseren commented 6 years ago

Hi,

The problem is NORMSIZE in the routine Normalize. This is set to 1000. In that case it allows up to 7*NORMALSIZE locations in the array psym, which needs two locations per symbol. Hence, if you have more than 3500 symbols in one term you have had it. Arguments count here as separate terms. Therefore your troubles occur only at the end.

Jos

On 14 sep. 2017, at 11:49, Takahiro Ueda notifications@github.com wrote:

Indeed, even if I don't use factarg and transform mulargs, I get a similar error, though this may be another bug:

:maxtermsize 1M

:workspace 400M

Auto S x; CF f; L F = f(x1,...,x3504);

do n={10,2}

repeat id f(x1?,...,xn'?,?a) = f(x1*...*xn',?a); * unrolling to save workspace

enddo

P; .end ==31959== Conditional jump or move depends on uninitialised value(s) ==31959== at 0x4BBC38: TestSub (proces.c:1612) ==31959== by 0x4B6B81: Generator (proces.c:3052) ==31959== by 0x4B811A: Generator (proces.c:3763) ==31959== by 0x4B8535: Generator (proces.c:3931) ==31959== by 0x4B811A: Generator (proces.c:3763) ==31959== by 0x4B8535: Generator (proces.c:3931) ==31959== by 0x4B811A: Generator (proces.c:3763) ==31959== by 0x4B8535: Generator (proces.c:3931) ==31959== by 0x4B811A: Generator (proces.c:3763) ==31959== by 0x4B8535: Generator (proces.c:3931) ==31959== by 0x4B811A: Generator (proces.c:3763) ==31959== by 0x4B8535: Generator (proces.c:3931) ==31959== === Workspace overflow. 400000000 bytes is not enough. === Change parameter WorkSpace in form.set The invalid conditional jump is before the workspace overflow. Increasing the workspace to 800M gives F = f(0).

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/issues/230#issuecomment-329432281, or mute the thread https://github.com/notifications/unsubscribe-auth/AFLxEgBNk1R3WXcNSmDtCfe6_s0tskF3ks5siPaxgaJpZM4PXL3k.

jodavies commented 6 years ago

OK, then, I need to devise a better test. I originally had problems with arguments that were large multivariate polynomials in only a few symbols. I made the test code use many symbols simply because it is an easy way to do it.

jodavies commented 6 years ago

Here is an example,

#-
Off Statistics;

Symbol x1,...,x10;
CFunction f;

Local test = f((x1+...+x10)^5);
.sort

FactArg f;
*Repeat Identify f(x1?,x2?,?x3) = f(x1*x2,?x3);
Transform f mulargs(1,last);

Print +s;
.end

The FactArg and the Repeat Identify construction work with no problems with form and vorm and there are no valgrind errors.

If one enables instead the Transform, form and vorm crash, with valgrind errors:

==4645== Invalid read of size 4
==4645==    at 0x4E560F: EndSort (sort.c:939)
==4645==    by 0x50AFCE: RunMulArg (transform.c:2822)
==4645==    by 0x50BC79: RunTransform (transform.c:758)
==4645==    by 0x4B8AC0: Generator (proces.c:3634)
==4645==    by 0x4BA9F3: Processor (proces.c:404)
==4645==    by 0x438717: DoExecute (execute.c:838)
==4645==    by 0x44EF96: ExecModule (module.c:274)
==4645==    by 0x4B0A12: PreProcessor (pre.c:962)
==4645==    by 0x4EA969: main (startup.c:1605)
==4645==  Address 0x68 is not stack'd, malloc'd or (recently) free'd
==4645== 
Program terminating at test3.frm Line 15 --> 
==4645== Invalid read of size 4
==4645==    at 0x504A32: Crash (tools.c:3711)
==4645==    by 0x4EA1FD: Terminate (startup.c:1707)
==4645==    by 0x4EA80F: onErrSig (startup.c:1476)
==4645==    by 0x5BAA4AF: ??? (in /lib/x86_64-linux-gnu/libc-2.23.so)
==4645==    by 0x4E560E: EndSort (sort.c:913)
==4645==    by 0x50AFCE: RunMulArg (transform.c:2822)
==4645==    by 0x50BC79: RunTransform (transform.c:758)
==4645==    by 0x4B8AC0: Generator (proces.c:3634)
==4645==    by 0x4BA9F3: Processor (proces.c:404)
==4645==    by 0x438717: DoExecute (execute.c:838)
==4645==    by 0x44EF96: ExecModule (module.c:274)
==4645==    by 0x4B0A12: PreProcessor (pre.c:962)
==4645==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==4645== 
==4645== 
==4645== Process terminating with default action of signal 11 (SIGSEGV)
==4645==  Access not within mapped region at address 0x0
==4645==    at 0x504A32: Crash (tools.c:3711)
==4645==    by 0x4EA1FD: Terminate (startup.c:1707)
==4645==    by 0x4EA80F: onErrSig (startup.c:1476)
==4645==    by 0x5BAA4AF: ??? (in /lib/x86_64-linux-gnu/libc-2.23.so)
==4645==    by 0x4E560E: EndSort (sort.c:913)
==4645==    by 0x50AFCE: RunMulArg (transform.c:2822)
==4645==    by 0x50BC79: RunTransform (transform.c:758)
==4645==    by 0x4B8AC0: Generator (proces.c:3634)
==4645==    by 0x4BA9F3: Processor (proces.c:404)
==4645==    by 0x438717: DoExecute (execute.c:838)
==4645==    by 0x44EF96: ExecModule (module.c:274)
==4645==    by 0x4B0A12: PreProcessor (pre.c:962)
jodavies commented 5 months ago

The crash here is due to landing on https://github.com/vermaseren/form/blob/e99543ad939ef823f5e8db0678aa18028e7de707/sources/sort.c#L949 but newout is 0. This is a par=0 sort, so EndSort has not allocated a file handle as it does for par==1 or 2.

Edit: why is the output of the transform statement not using a par=1 sort? I don't fully understand EndSort, but from what I can tell if you do a par=0 sort with sLevel > 0, it will never be able to merge large patches.

If I blindly change the EndSort in RunMulArg to par=1, things seem to run fine...