Open jodavies opened 1 month ago
It looks like parform
hangs after printing the error here.
Hmm...For some reason tform
on macos-11
doesn't print "Please increase SubSmallExtension setup parameter."
before terminating.
I have no way to test if it crashes for some other reason.
This may be irrelevant, but on MacOS (Sonoma 14.4.1
, Apple M2
, Apple clang version 15.0.0
), I got some malloc
errors for Issue512_2
:
./build/sources/vorm -D TEST=Issue512_2 check/fixes.frm
FORM 5.0.0-beta.1 (May 4 2024, v5.0.0-beta.1-57-g1a75289) Run: Sat May 11 17:47:37 2024
#ifndef `TEST'
#message Use -D TEST=XXX
#terminate
#else
#include `NAME_' # `TEST'
#-
Time = 0.00 sec Generated terms = 1
test Terms in output = 1
Bytes used = 20
Please increase SubSmallExtension setup parameter.
Program terminating at check/fixes.frm Line 2722 -->
0.12 sec out of 0.13 sec
vorm(25084,0x1fcafbac0) malloc: *** error for object 0x300000001: pointer being freed was not allocated
vorm(25084,0x1fcafbac0) malloc: *** set a breakpoint in malloc_error_break to debug
The corresponding Valgrind run on Linux is:
valgrind ./build/sources/vorm -D TEST=Issue512_2 check/fixes.frm
FORM 5.0.0-beta.1 (May 4 2024, v5.0.0-beta.1-57-g1a75289) Run: Sat May 11 17:55:02 2024
...
Time = 0.05 sec Generated terms = 1
test Terms in output = 1
Bytes used = 20
==118620== Invalid write of size 4
==118620== at 0x20C263: GarbHand (sort.c:3599)
==118620== by 0x20CA39: AddPoly (sort.c:2199)
==118620== by 0x20CD2F: SplitMerge (sort.c:3345)
==118620== by 0x20CBA2: SplitMerge (sort.c:3330)
==118620== by 0x20CBA2: SplitMerge (sort.c:3330)
==118620== by 0x20CBA2: SplitMerge (sort.c:3330)
==118620== by 0x20CBA2: SplitMerge (sort.c:3330)
==118620== by 0x20CBBB: SplitMerge (sort.c:3331)
==118620== by 0x20CBBB: SplitMerge (sort.c:3331)
==118620== by 0x20CBA2: SplitMerge (sort.c:3330)
==118620== by 0x20E7EB: StoreTerm (sort.c:4440)
==118620== by 0x1DF55F: Generator (proces.c:3366)
==118620== Address 0x4bf8a5b8 is 0 bytes after a block of size 22,857,080 alloc'd
==118620== at 0x4849839: malloc (in /home/linuxbrew/.linuxbrew/Cellar/valgrind/3.22.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==118620== by 0x22C03F: Malloc1 (tools.c:2322)
==118620== by 0x206335: AllocSort (setfile.c:916)
==118620== by 0x209261: NewSort (sort.c:641)
==118620== by 0x126FDB: CoIdExpression (comexpr.c:824)
==118620== by 0x12765A: CoId (comexpr.c:398)
==118620== by 0x13ADF2: CompileStatement (compiler.c:658)
==118620== by 0x1D76E1: PreProcessor (pre.c:1123)
==118620== by 0x214BAE: main (startup.c:1690)
==118620==
Please increase SubSmallExtension setup parameter.
Program terminating at check/fixes.frm Line 2722 -->
==118620== Invalid read of size 4
==118620== at 0x21443B: Terminate (startup.c:1792)
==118620== by 0x4BF78A6F: ???
==118620== by 0x20CA86: AddPoly (sort.c:2249)
==118620== by 0x20CD2F: SplitMerge (sort.c:3345)
==118620== by 0x20CBA2: SplitMerge (sort.c:3330)
==118620== by 0x20CBA2: SplitMerge (sort.c:3330)
==118620== by 0x20CBA2: SplitMerge (sort.c:3330)
==118620== by 0x20CBA2: SplitMerge (sort.c:3330)
==118620== by 0x20CBBB: SplitMerge (sort.c:3331)
==118620== by 0x20CBBB: SplitMerge (sort.c:3331)
==118620== by 0x20CBA2: SplitMerge (sort.c:3330)
==118620== by 0x20E7EB: StoreTerm (sort.c:4440)
==118620== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==118620==
6.78 sec out of 6.79 sec
==118620== Invalid free() / delete / delete[] / realloc()
==118620== at 0x484C89F: free (in /home/linuxbrew/.linuxbrew/Cellar/valgrind/3.22.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==118620== by 0x22C0B0: M_free (tools.c:2449)
==118620== by 0x1404D8: ClearSortGZIP (compress.c:731)
==118620== by 0x20FCDC: CleanUpSort (sort.c:4737)
==118620== by 0x2140EB: CleanUp (startup.c:1710)
==118620== by 0x21440E: Terminate (startup.c:1850)
==118620== by 0x214A53: onErrSig (startup.c:1558)
==118620== by 0x4D8E51F: ??? (in /usr/lib/x86_64-linux-gnu/libc.so.6)
==118620== by 0x21443A: Terminate (startup.c:1790)
==118620== by 0x4BF78A6F: ???
==118620== by 0x20CA86: AddPoly (sort.c:2249)
==118620== by 0x20CD2F: SplitMerge (sort.c:3345)
==118620== Address 0x1f00000098 is not stack'd, malloc'd or (recently) free'd
==118620==
==118620== Invalid free() / delete / delete[] / realloc()
==118620== at 0x484C89F: free (in /home/linuxbrew/.linuxbrew/Cellar/valgrind/3.22.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==118620== by 0x22C0B0: M_free (tools.c:2449)
==118620== by 0x1404E8: ClearSortGZIP (compress.c:732)
==118620== by 0x20FCDC: CleanUpSort (sort.c:4737)
==118620== by 0x2140EB: CleanUp (startup.c:1710)
==118620== by 0x21440E: Terminate (startup.c:1850)
==118620== by 0x214A53: onErrSig (startup.c:1558)
==118620== by 0x4D8E51F: ??? (in /usr/lib/x86_64-linux-gnu/libc.so.6)
==118620== by 0x21443A: Terminate (startup.c:1790)
==118620== by 0x4BF78A6F: ???
==118620== by 0x20CA86: AddPoly (sort.c:2249)
==118620== by 0x20CD2F: SplitMerge (sort.c:3345)
==118620== Address 0x41fffffff0 is not stack'd, malloc'd or (recently) free'd
==118620==
==118620== Syscall param unlink(pathname) points to unaddressable byte(s)
==118620== at 0x4E620AB: unlink (syscall-template.S:120)
==118620== by 0x4DADE6C: remove (remove.c:33)
==118620== by 0x20FCF4: CleanUpSort (sort.c:4741)
==118620== by 0x2140EB: CleanUp (startup.c:1710)
==118620== by 0x21440E: Terminate (startup.c:1850)
==118620== by 0x214A53: onErrSig (startup.c:1558)
==118620== by 0x4D8E51F: ??? (in /usr/lib/x86_64-linux-gnu/libc.so.6)
==118620== by 0x21443A: Terminate (startup.c:1790)
==118620== by 0x4BF78A6F: ???
==118620== by 0x20CA86: AddPoly (sort.c:2249)
==118620== by 0x20CD2F: SplitMerge (sort.c:3345)
==118620== by 0x20CBA2: SplitMerge (sort.c:3330)
==118620== Address 0x5 is not stack'd, malloc'd or (recently) free'd
Something wrong seems to be happening before calling Terminate()
...
I'll have a look at it. It might be related to https://github.com/vermaseren/form/issues/468#issuecomment-2104900922
The valgrind error here is due to GarbHand
making no attempt to avoid writing out of bounds. The old content of the small buffer, plus the stuff in SplitScratch, is copied back to the small buffer here:
https://github.com/vermaseren/form/blob/0112aa6b21b48616575c6cd205ec00c7797dbc93/sources/sort.c#L3581-L3611
The function certainly knows things have gone wrong; if you define TESTGARB
it prints "We are in deep trouble", and form is certainly going to Terminate
after return. But this can be made "valgrind clean" by comparing s2+i
with sTop2
in each iteration of the while loops (maybe just the second: the original content of the small buffer should certainly fit again), before the NCOPY
.
That we get a valgrind error here means that the content of the old small buffer + SplitScratch has blown through the small extension, and also through the PObuffer
, which follows the small buffer in the allocation of AllocSort
.
(maybe just the second: the original content of the small buffer should certainly fit again)
This is actually not the case here; it is the first loop which causes the problem. I don't understand why, though. The general idea is,
AddCoef
and AddPoly
, try to put a new term into the small bufferI would expect that the original terms pointed to by sPointer should be guaranteed to fit in the SmallBuffer+Extension?
I added some prints to try to work out what is going on here. For a "working" call of GarbHand
, I get something like the following:
sort.c AddPoly 2196, make new term (S->sFill 932624) (new term 7106) --> 939730
sort.c AddPoly 2196, make new term (S->sFill 939730) (new term 9192) --> 948922
sort.c AddPoly 2196, make new term (S->sFill 948922) (new term 12614) --> 961536
------Garbage collection-------
in: S->sFill = 948922, S->sTop2 = 960008
total = 926812, nterms = 0, SmallSize = 640004, SmallESize = 960008
out: S->sFill = 926812, S->sTop2 = 960008
so as soon as sFill plus the size of the new term is >= sTop2, GarbHand is called which clears out the gaps in the small buffer (this is all during a SplitMerge so terms are being merged a lot).
At the problematic call:
sort.c AddPoly 2196, make new term (S->sFill 959886) (new term 72) --> 959958
sort.c AddPoly 2196, make new term (S->sFill 959958) (new term 48) --> 960006
sort.c AddPoly 2196, make new term (S->sFill 960006) (new term 48) --> 960054
------Garbage collection-------
in: S->sFill = 960006, S->sTop2 = 960008
total = 982988, nterms = 0, SmallSize = 640004, SmallESize = 960008
==96593== Invalid write of size 4
==96593== at 0x221EB0: GarbHand (sort.c:3597)
==96593== by 0x22277E: AddPoly (sort.c:2206)
==96593== by 0x222B07: SplitMerge (sort.c:3321)
Here "total" is larger than sTop2 (I changed the print compared to the master branch, which prints 2*total for some reason). But sFill was 960006 before the call, and it was only trying to add a term of 48 words. So I am not sure how a "total" of 982988 is possible, unless the sPointer
array contains terms unaccounted for in sFill
-- as I understand, this should not be the case.
I am testing on top of #529 so writes over sTop2 are an immediate crash, unlike in the master branch which needs a write beyond PObuffer also (the test rebased onto master actually doesn't cause a valgrind error anymore, due to the increased value of SIOsize
from #515).
The issue is that "NEWSPLITMERGE" is not strict about setting pointers to zero, for terms which have merged (or cancel, but in this test nothing cancels). If we use the old code, GarbHand
correctly computes the size of the terms which remain.
Maybe this was done for speed, but what is means is that the "new" code fails in some cases where the old succeeds because GarbHand cannot work properly. If you run Issue512_2
with a SubSmallExtension
of 5090K, it will run to completion with the old code, but crash with the new because it determines (incorrectly) that SubSmallExtension is too small.. The Timsort
code is commented for this tests, but it doesn't change anything.
@tueda can you test this on your mac?
In case it is needed again, I used the following print code (and diff) in the two versions of SplitMerge
, to make sure the pointer array now looks the same in both cases.
// Print current state
printf("%4ld: ", number);
for (int ii = 0; ii < S->sTerms; ii++) {
if ( (S->sPointer)[ii] ) {
printf("X");
}
else {
printf(".");
}
}
printf("\n");
fflush(stdout);
Now it works on my MacBook Air.
OK, that is good. I have run a few mincer benchmarks and did not see any performance impact. I think in FORM the term comparison is so expensive that no amount of pointer operations during the sort is noticable.
I have a "re-written" version of SplitMerge
with some variable name changes and a bunch of commentary to make sure I understand it. Maybe the "improvement in the style of Timsort" can be further improved; at the moment it checks the right-hand half of the left array for the merge point of the right array with a binary search. In principle one could check the entire left array for the merge point, instead of just giving up immediately if the terms are not ordered at the midpoint. Maybe Jos already tested this at the time, and found it not to help? I will test this, but it would be for a new PR.
Previously, FORM warns only about SmallExtension, even if it is a sub-buffer sort.
This prompts the user to adjust the wrong buffer in the setup.
Fixes #512