vermaseren / form

The FORM project for symbolic manipulation of very big expressions
GNU General Public License v3.0

Segfault during optimisation of complicated expressions #447

Closed: GraDje13 closed this issue 4 months ago

GraDje13 commented 1 year ago

Sometimes a segfault occurs when applying optimization to complicated expressions. A minimal example that reproduces it:


* Declare the symbols used in F (FORM requires declarations).
Symbols x,y,z,a,b,c;
Local F = (x+y+z+a+b+c)^50;

.sort;
Format O1;
#Optimize F
#write<out_test_seg> "%O"

.end

The segfault does not happen if F is simplified, for example by lowering the power from 50 to 40. The segfault also occurs at the other optimization levels O2 and O3.

I also ran this program with gdb, which gave the following backtrace:

0x00007ffff796c6a1 in ?? () from /usr/lib/libc.so.6
(gdb) backtrace
#0  0x00007ffff796c6a1 in ?? () from /usr/lib/libc.so.6
#1  0x00005555555ee7f8 in Horner_tree (expr=<optimized out>, order=std::vector of length 6, capacity 6 = {...}) at optimize.cc:890
#2  0x00005555555f2de1 in try_MCTS_scheme (scheme=std::vector of length 6, capacity 6 = {...}, pnum_oper=pnum_oper@entry=0x7fffffffdbec) at optimize.cc:1971
#3  0x00005555555f2eca in find_Horner_MCTS_expand_tree () at optimize.cc:2047
#4  0x00005555555f32a4 in find_Horner_MCTS () at optimize.cc:2296
#5  0x00005555555f3681 in Optimize (exprnr=0, do_print=do_print@entry=0) at optimize.cc:4683
#6  0x00005555556262ee in DoOptimize (s=<optimized out>) at pre.c:6833
#7  0x000055555562bb89 in PreProInstruction () at pre.c:1233
#8  0x000055555562c788 in PreProcessor () at pre.c:1010
#9  0x000055555566af31 in main (argc=2, argv=0x7fffffffe068) at startup.c:1688

benruijl commented 1 year ago

The problem is that the workspace is too small, which causes a segfault. Setting

#:Workspace 2G

at the start of the file makes it work. Adding a check for the workspace size in Horner_tree is probably a good idea.
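The kind of check suggested here can be sketched in a few lines. This is a minimal illustration, not FORM's code: the `Workspace` struct, its `reserve` method, and the use of `int` as the word type are all hypothetical names and assumptions chosen for the example. The point is only that a request is compared against the remaining capacity before any write happens.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a fixed-size workspace measured in words.
// None of these names come from the FORM sources.
struct Workspace {
    std::vector<int> buffer;  // storage, one element per word (assumed word type)
    std::size_t used = 0;     // number of words already handed out

    explicit Workspace(std::size_t words) : buffer(words) {}

    // Reserve n words. Returns a pointer to the reserved block, or nullptr
    // if the request does not fit, so the caller can report an error
    // instead of writing past the end of the buffer.
    int* reserve(std::size_t n) {
        assert(used <= buffer.size());
        if (n > buffer.size() - used) return nullptr;  // cannot underflow: used <= size
        int* p = buffer.data() + used;
        used += n;
        return p;
    }
};
```

A caller would test the returned pointer and abort with a "workspace too small" message when it is null, rather than continuing with the copy.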

tueda commented 10 months ago

Actually, we have a check for the workspace

https://github.com/vermaseren/form/blob/8a37a42d449f19ad152783ca687d16b1a5864e36/sources/optimize.cc#L874-L887

but it does not seem to work.

jodavies commented 4 months ago

Here, if I print sumsize, I get 37533050 (words). WorkSize is the default 40000000 (words), so everything should fit. But if I print *t in the loop that follows and add up the values, the total exceeds 40M, hence the segfault. Yet sumsize is supposed to contain exactly this sum already...
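The mismatch described here (a precomputed total that passes the size check while the sizes actually copied add up to more) can be made explicit with a consistency check. The sketch below is an assumption-laden illustration, not the code in optimize.cc: the helper names and the representation of term sizes as a `std::vector` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sum the per-term sizes, in words. Hypothetical helper.
std::size_t total_words(const std::vector<std::size_t>& term_sizes) {
    return std::accumulate(term_sizes.begin(), term_sizes.end(), std::size_t{0});
}

// True if copying every term stays inside a workspace of work_size words.
bool fits_in_workspace(const std::vector<std::size_t>& term_sizes,
                       std::size_t work_size) {
    return total_words(term_sizes) <= work_size;
}

// True if a precomputed estimate agrees with the sizes that will be copied.
// A false result means the guard was evaluated on the wrong number.
bool estimate_is_consistent(std::size_t estimate,
                            const std::vector<std::size_t>& term_sizes) {
    return estimate == total_words(term_sizes);
}
```

With the reported values, a guard based on the estimate passes (37533050 <= 40000000) even though the copied sizes exceed the workspace, so a check like `fits_in_workspace` evaluated on the real sizes would be the one that fails.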

Also, here the warning prints WorkSize in bytes, but this value is in words. MesWork in message.c also has the wrong units.

jodavies commented 4 months ago

This also looks like #379.

tueda commented 4 months ago

Good catch! Thanks. I will merge #481.

> Also, here the warning prints WorkSize in bytes, but this value is in words. MesWork in message.c also has the wrong units.

Right. AM.WorkSize is the workspace size in words:

https://github.com/vermaseren/form/blob/5dce05f81e267fe7eeefaad2d2fdd3a441b13519/sources/setfile.c#L447

Would you like to make a commit to fix these messages?
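To illustrate the units issue, here is a small sketch of a message helper that states the unit explicitly and converts only when asked. It is not the code in message.c: the function names are hypothetical, and the word size is passed in as a parameter because the size of a word depends on the build. The value 4 used in the example below is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Convert a size in words to bytes. bytes_per_word is supplied by the
// caller; this sketch does not assume a particular word size.
constexpr std::size_t words_to_bytes(std::size_t words,
                                     std::size_t bytes_per_word) {
    return words * bytes_per_word;
}

// Build a message fragment that names both units, so a reader cannot
// mistake a word count for a byte count. Hypothetical helper.
std::string describe_workspace(std::size_t words, std::size_t bytes_per_word) {
    return std::to_string(words) + " words (" +
           std::to_string(words_to_bytes(words, bytes_per_word)) + " bytes)";
}
```

For example, assuming 4-byte words, `describe_workspace(40000000, 4)` yields "40000000 words (160000000 bytes)".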