Closed sisong closed 4 years ago
carsh file download from : http://ftp.tsukuba.wide.ad.jp/software/gcc/releases/gcc-7.3.0/
decompress it and to tar: gcc-7.3.0.tar (686.3MB) file for test;
thanks sisong. I will attempt to reproduce the issue.
Any progress on this issue?
I was never able to reproduce the test file which @sisong described. The directions provided for reproducing the test file resulted in a tar which was no where near the size that the description suggested it should be.
Also, at the time the code would not have compiled on xcode so I assume there were modifications to the source to make it compile, however, those modifications were not provided.
Having said that, the latest master
branch should compile without modification for xcode. If @sisong can reproduce the test file that would help to resolve this issue.
Ok, thanks.
thanks for the reply!
I used new master code & 6 thread, same crash:
"Thread 4: EXC_BAD_ACCESS (code=2, address=0x7000079e3ff8)"
It should be caused by the recursive call to multikey_quicksort (crash at 1038 layer);
test case :
I bet that's a stack overflow due to excessive recursion. I'll take a look.
Confirmed. It's a stack overflow. I will push a change to use heap rather than the callstack for the quicksort as soon as I get a chance.
solved:
changed quicksort to use heap rather than callstack. In the long run this won't be an issue because induced sorting (the missing feature for v4) will solve this problem. However, algorithmically speaking, the best solution will be to move from a pure multikey quicksort to a multikey introsort where quicksort will eventually transition to heapsort in cases where the recursion depth becomes too deep. However, that makes the code more complex and, once induced sorting is added, would become an extremely unlikely corner cases anyhow.
@sisong @eerimoq Thanks for drawing attention to this open issue.
the usual approach is to sort smaller partition first, using recursion, and then go to the sorting of remainder without recursive call. this way you ensure that callstack size will be limited to log2(array size)
std::vector<stack_frame> stack;
xcode: Use of undeclared identifier 'stack_frame'
@sisong doh! Forgot to commit the header too. Fixed now
@Bulat-Ziganshin
In this is case, however, we are dealing with a multikey quicksort so the excessive recursion in this case is due to very large common suffix match lengths rather than being a matter of partition sizes.
Interesting to note, MSufSort does exactly the opposite in regards to partitions. The largest partitions are processed first. But this is done to minimize the chances of a single thread being assigned to process a large partition last while all other threads wait due to having no more partitions to sort.
My project used libdivsufsort, it runs on macos, windows, linux, android, etc.
Now I want to use msufsort4, compile and test with xcode;
Single-threaded ok, running at about the same speed as libdivsufsort;
Multi-Thread speed is faster than libdivsufsort, but crashes on some inputs(bigger data more likely,such as gcc source code tar);
Keeping your good work!