JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.73k stars 5.49k forks source link

sortperm ERROR: stack overflow #8347

Closed paulanalyst closed 9 years ago

paulanalyst commented 10 years ago

Version 0.3.0 (2014-08-20 20:43 UTC) Official http://julialang.org/ build x86_64-w64-mingw32

julia> test3=vec(readcsv("test3.txt")) 5384392-element Array{Float64,1}: julia> p=sortperm(test3) ERROR: stack overflow

julia>

File with vector after rar has 21MB and is too bigg to atach. Where I can put this vecor for Yours tests? Paul

ViralBShah commented 10 years ago

Cc: @vtjnash

tkelman commented 10 years ago

Dropbox? Google drive? Skydrive? etc

paulanalyst commented 10 years ago

Is sent to You via google disk. Paul

tkelman commented 10 years ago

Is sent to You via google disk.

Not sure what you mean there, I didn't get anything. A public link in the comments here would be best.

paulanalyst commented 10 years ago

https://drive.google.com/file/d/0B9xW5VtANWhDc0pPVWpZT0taSHM/edit?usp=sharing

W dniu 2014-09-14 o 15:38, Tony Kelman pisze:

Is sent to You via google disk.

Not sure what you mean there, I didn't get anything. A public link in the comments here would be best.

— Reply to this email directly or view it on GitHub https://github.com/JuliaLang/julia/issues/8347#issuecomment-55525799.

tkelman commented 10 years ago

I can reproduce this using 0.3.0, but not with latest 0.4.0-dev. Can you try the most recent nightly from http://status.julialang.org/download/win64 ? It's almost 3 weeks old now, not sure whether it will contain whatever the fix was. @ihnorton is automated building of the Windows nightlies hung up on something?

tkelman commented 10 years ago

@paulanalyst whoops, sorry, nevermind, don't download the version that says commit c6036da, will need something newer than that. Check http://status.julialang.org/ until you see a newer date for "win64 nightly last built"

paulanalyst commented 10 years ago

Ok, THX, I am waiting P. W dniu 2014-09-14 o 19:49, Tony Kelman pisze:

@paulanalyst https://github.com/paulanalyst whoops, sorry, nevermind, don't download the version that says commit c6036da https://github.com/JuliaLang/julia/commit/c6036da1759562282413480eda4e6865bfb905dc, will need something newer than that. Check http://status.julialang.org/ until you see a newer date for "win64 nightly last built"

— Reply to this email directly or view it on GitHub https://github.com/JuliaLang/julia/issues/8347#issuecomment-55533278.

tkelman commented 10 years ago

According to bisect this was fixed on master by 6667e5d788ae0b19766f29e4518afb99ff27bd8e. That was the commit that started causing master to segfault for about a week before it was fixed. There's a series of related commits there, it seems like things are relatively stable now. @JeffBezanson - thoughts on whether that series of commits should be backported to release-0.3 to fix this bug there too? I'm not sure what about sortperm is related to those changes.

paulanalyst commented 10 years ago

No sortperm in this week ? Paul

W dniu 2014-09-15 o 10:08, Tony Kelman pisze:

According to bisect this was fixed on master by 6667e5d https://github.com/JuliaLang/julia/commit/6667e5d788ae0b19766f29e4518afb99ff27bd8e. That was the commit that started causing master to segfault for about a week before it was fixed. There's a series of related commits there, it seems like things are relatively stable now. @JeffBezanson https://github.com/JeffBezanson - thoughts on whether that series of commits should be backported to release-0.3 to fix this bug there too? I'm not sure what about sortperm is related to those changes.

— Reply to this email directly or view it on GitHub https://github.com/JuliaLang/julia/issues/8347#issuecomment-55562268.

JeffBezanson commented 10 years ago

It's hard to see why that change would fix a stack overflow in sortperm. Did you get a stack trace? Where is the stack overflow exactly? Anyway that change should not be backported.

paulanalyst commented 10 years ago

this vector is always stoped proces :

https://drive.google.com/file/d/0B9xW5VtANWhDc0pPVWpZT0taSHM/edit?usp=sharing

            _
_       _ _(_)_     |  A fresh approach to technical computing

() | () () | Documentation: http://docs.julialang.org | |_ | Type "help()" for help. | | | | | | |/ ` | | | | || | | | (| | | Version 0.3.0 (2014-08-20 20:43 UTC) / |_'||_|'_| | Official http://julialang.org/ build |__/ | x86_64-w64-mingw32

julia> temp3=vec(readcsv("test3.txt")) 5384392-element Array{Float64,1}: 1.03896 ... 0.282558

julia> sortperm(temp3) ERROR: stack overflow

julia>

Do You have the file ? Paul

W dniu 2014-09-16 o 16:35, Jeff Bezanson pisze:

It's hard to see why that change would fix a stack overflow in sortperm. Did you get a stack trace? Where is the stack overflow exactly? Anyway that change should not be backported.

simonster commented 10 years ago

@JeffBezanson Maybe that reduced the footprint of Perm (and avoids passing MergeSortAlg in the call) so that the recursion in sort! is no longer overflowing the stack?

paulanalyst commented 10 years ago

W dniu 2014-09-16 o 19:32, Simon Kornblith pisze:

@JeffBezanson https://github.com/JeffBezanson Maybe that reduced the footprint of Perm https://github.com/JuliaLang/julia/blob/1ee440bee5035ccb33f82b8a45febddd2f973baa/base/ordering.jl#L38 so that the recursion in |sort!| is no longer exhausting the stack?

I have a suspicion that stack overflow is when the vector is pretty much identical numbers (wheel yourself or scattered). Multiplying * 10 ^ 9 and conversion to Int sometimes helps but rarely. Paul

tkelman commented 10 years ago

Did you get a stack trace?

Nope, no backtrace whatsoever.

Where is the stack overflow exactly?

No idea. Couldn't reproduce with the same file on Linux. I think the default stack size is 1/8th the size on Windows as it is on Linux, we add some flags to help with that when linking but maybe they aren't applying at runtime?

simonster commented 10 years ago

If I read the assembly correctly it also looked like the function reserved 72 bytes of stack on Windows vs. 24 on *nix.

pao commented 10 years ago

Couldn't reproduce with the same file on Linux. I think the default stack size is 1/8th the size on Windows as it is on Linux

Maybe try setting a smaller stack with ulimit -s?

vtjnash commented 10 years ago

simonster's analysis sounds quite likely. we use the same stack size on windows as linux (via linker flags).

Did you get a stack trace?

window's needs a lot of stack space to generate a stack trace and doesn't allow you to do it on alternate stack (without a small amount of work to alter the stack frame manually, ala switch_stack), so it's a double-whammy with the net effect that I disabled stack trace generation for stack overflows on windows

toivoh commented 10 years ago

Just speculating here, but could we be hitting a degenerate case where quicksort gets an O(n) recursion level?

tkelman commented 10 years ago

And looks like Isaiah successfully kicked things and got them back running, so now there's an 0.4-dev nightly you can use that should be free of this bug: http://status.julialang.org/download/win64

paulanalyst commented 10 years ago

I am analyst only, not programers ;)

  1. Is any soft to save/view stack trace?
  2. Are we findig 'after-you-after-you blocking' or somethink lik that ?
  3. Do I have to check this on Linux on the same code? Paul

W dniu 2014-09-17 o 05:52, Tony Kelman pisze:

Did you get a stack trace?

Nope, no backtrace whatsoever.

Where is the stack overflow exactly?

No idea. Couldn't reproduce with the same file on Linux. I think the default stack size is 1/8th the size on Windows as it is on Linux, we add some flags to help with that when linking but maybe they aren't applying at runtime?

paulanalyst commented 10 years ago

|sorry ;) but ulimit-s where? Win? Julia ? Paul

|W dniu 2014-09-17 o 06:05, pao pisze:

Couldn't reproduce with the same file on Linux. I think the
default stack size is 1/8th the size on Windows as it is on Linux

Maybe try setting a smaller stack with |ulimit -s|?

StefanKarpinski commented 10 years ago

As a workaround for this issue, which I think is the most pressing matter (from your perspective), @paulanalyst, can you try sortperm(test3, alg=MergeSort) instead? The different sorting algorithm may avoid the stack overflow. Alternatively, install the SortingAlgorithms package and try using SortingAlgorithms and then do sortperm(test3, alg=HeapSort). In any case, the main issue here seems to be that this may be that for lots of similar values QuickSort behaves badly. I'll investigate. The fact that the stack happens to be big enough on some systems is sort of a side issue.

StefanKarpinski commented 10 years ago

cc: @illerucis, @kmsquire

pao commented 10 years ago

but ulimit-s where? Win? Julia ?

That was intended for @tkelman to try to reproduce the stack overflow on Linux.

illerucis commented 10 years ago

hmm we can try a different pivot selection but I think median-of-three should still recurse like logN in this case, unless I'm missing something

paulanalyst commented 10 years ago

Hi In Version 0.4.0-dev+626 Is beter. I can do sortperm on the sample test3.txt. Thx! But i have a new vector (test4.txt) who jamed sortperm (link down), Please help ! My research on hold ;) link to test4: https://drive.google.com/file/d/0B9xW5VtANWhDNk9pWnVWbW1xb28/edit?usp=sharing proces: () | A fresh approach to technical computing () | () () | Documentation: http://docs.julialang.org | |_ | Type "help()" for help. | | | | | | |/ ` | | | | || | | | (| | | Version 0.4.0-dev+626 (2014-09-17 04:14 UTC) / |_'||_|'_| | Commit 97f6642 (1 day old master) |__/ | x86_64-w64-mingw32

julia> test4=vec(readcsv("test4.txt")) 5931309-element Array{Float64,1}: -0.646265 3.28372 ? 0.293165 -0.678435

julia> p=sortperm(test4) ERROR: stack overflow

Paul

paulanalyst commented 10 years ago

@ Stefan sortperm(test4, alg=MergeSort) is working, thx, Paul

illerucis commented 10 years ago

will dig into this tonight.

vtjnash commented 10 years ago

If I reduce the stack space to 2MB (ulimit -s 2000), linux hits a stack overflow also (on test3.txt)

paulanalyst commented 10 years ago

ERROR: stack overflow

Unfortunatly sortperm still do not work on long vectors with many the same values : () | A fresh approach to technical computing () | () () | Documentation: http://docs.julialang.org | |_ | Type "help()" for help. | | | | | | |/ ` | | | | || | | | (| | | Version 0.4.0-dev+1130 (2014-10-18 02:03 UTC) / |_'||_|'_| | Commit f4cc46f* (0 days old master) |__/ | x86_64-w64-mingw32

julia> y=vec(readcsv("y.txt")) 6582876-element Array{Float64,1}: 0.000421203 ... 0.000421203 0.000421203 -0.0446317

julia> p=sortperm(y) ERROR: stack overflow

julia>

file (vector) is here https://drive.google.com/file/d/0B9xW5VtANWhDcHdoYWtISGFUWjg/view?usp=sharing Paul

andreasnoack commented 10 years ago

It works on my mac with latest master. I don't have access to a Windows machine right now. Could someone with a Windows machine try this out?

Med venlig hilsen

Andreas Noack

2014-10-18 3:46 GMT-04:00 paulanalyst notifications@github.com:

Unfortunatly sortperm do not work on long vectros with many the same values () | A fresh approach to technical computing () | () ( ) | Documentation: http://docs.julialang.org http://docs.julialang.org | |_ | Type "help()" for help. | | | | | | |/ ` | | | | || | | | ( | | | Version 0.4.0-dev+1130 (2014-10-18 02:03 UTC) / |_'||_|' | | Commit f4cc46f https://github.com/JuliaLang/julia/commit/f4cc46f84b0e38daef3e96e51e166d84d90cfe73 (0 days old master) |_*/ | x86_64-w64-mingw32

julia> y=vec(readcsv("y.txt")) 6582876-element Array{Float64,1}: 0.000421203 ... 0.000421203 0.000421203 -0.0446317

julia> p=sortperm(y) ERROR: stack overflow

julia>

file (vector) is here

https://drive.google.com/file/d/0B9xW5VtANWhDcHdoYWtISGFUWjg/view?usp=sharing Paul

— Reply to this email directly or view it on GitHub https://github.com/JuliaLang/julia/issues/8347#issuecomment-59602405.

vtjnash commented 10 years ago

i have a non-stack (heap) based version of mergesort somewhere around (@haykyegoryan)

someone would need to write a non-stack (heap) based version of quicksort to resolve this particular issue however

kmsquire commented 10 years ago

As a workaround, you could use alg=MergeSort, possibly replacing MergeSort with one of the sorts in SortingAlgorithms.jl (e.g., HeapSort, TimSort).

paulanalyst commented 10 years ago

alg=MergeSort works but plot(diff(y2[p])) sugest that somthing wrong...

julia> y2=vcat(y,y,y,y,y);

julia> p=sortperm(y2,alg=MergeSort,rev=true);

julia> plot(diff(y2[p]))

julia>

W dniu 2014-10-20 o 06:36, Kevin Squire pisze:

As a workaround, you could use |alg=MergeSort|, possibly replacing MergeSort with one of the sorts in SortingAlgorithms.jl (e.g., HeapSort, TimSort).

— Reply to this email directly or view it on GitHub https://github.com/JuliaLang/julia/issues/8347#issuecomment-59682926.