SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License

SystemStackError for larger filters #240

Closed (gnilrets closed this issue 8 years ago)

gnilrets commented 8 years ago

Something very bad is happening with filters. I recently upgraded a bunch of my code to 0.1.4 and I started getting SystemStackErrors with some of my larger dataframes. Here's how it can be reproduced:

require 'daru'

df = Daru::DataFrame.new({ a: [1] * 140_000 })
df.where(df[:a].eq(1))
#=> SystemStackError: stack level too deep

v0dro commented 8 years ago

Having a look.

v0dro commented 8 years ago

This error usually happens when recursion is involved, but there is none here. Investigating now.

v0dro commented 8 years ago

@zverok @mrkn have you come across something like this before?

v0dro commented 8 years ago

Interesting answer: http://stackoverflow.com/questions/11544460/how-to-get-a-backtrace-from-a-systemstackerror-stack-level-too-deep

v0dro commented 8 years ago

@gnilrets I just learned about caller.length from that answer, and it appears that the problem occurs in the Array#values_at call here.
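For readers unfamiliar with it, here is a minimal sketch of what caller.length reports. The method name `depth_after` is hypothetical and only used for illustration; `caller` is the standard Kernel method that returns the frames above the current one.

```ruby
# Hypothetical helper: recurse `n` times, then report how many frames
# sit above the innermost call.
def depth_after(n)
  n.zero? ? caller.length : depth_after(n - 1)
end

base  = caller.length     # frames above the current one
depth = depth_after(50)   # frames above the innermost recursive call

# The difference grows with the amount of Ruby-level recursion.
puts "extra frames added by recursion: #{depth - base}"
```

If the depth measured just before a failing call is small, Ruby-level recursion is an unlikely cause of the overflow.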

Upon testing Array#values_at with a simple Array:

[3] pry(main)> a = [1] * 140_000
[4] pry(main)> a.values_at(1,2,3)
=> [1, 1, 1]
[5] pry(main)> a.values_at([1,2,3]*100_000)
TypeError: no implicit conversion of Array into Integer
from (pry):5:in `values_at'
[6] pry(main)> a.values_at(*[1,2,3]*100_000)
SystemStackError: stack level too deep
from (pry):6:in `__pry__'

@mrkn I think we just found a bug in Ruby. An internal recursive call in MRI is probably what causes it.
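
Until the cause is confirmed, one possible workaround is to avoid splatting a very large index list into values_at. The sketch below is an assumption, not the fix adopted in daru: `fetch_positions` is a hypothetical helper, and whether the splat form actually raises depends on the Ruby version and VM stack size, so the comparison at the end handles both outcomes.

```ruby
# Hypothetical helper (not part of daru): look up elements one index at a
# time instead of passing every index as a separate method argument.
def fetch_positions(array, positions)
  positions.map { |i| array[i] }
end

a         = [1] * 140_000
positions = [1, 2, 3] * 100_000

result = fetch_positions(a, positions)
puts result.length

# For comparison: the splat form from the pry session above. It may raise
# SystemStackError on some Ruby versions and succeed on others.
splat_outcome =
  begin
    a.values_at(*positions).length
  rescue SystemStackError => e
    e.class
  end
puts splat_outcome.inspect
```

Like values_at, this returns nil for out-of-range integer indexes, but it does not handle Range arguments, which values_at accepts.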