JuliaData / JuliaDB.jl

Parallel analytical database in pure Julia
http://juliadb.org/

ERROR: OutOfMemoryError() when creating a large table #242

Open skanskan opened 5 years ago

skanskan commented 5 years ago

How can I create a large table using JuliaDB?

I would like to be able to perform operations such as these:

using DataFrames
N = 3
myDT = DataFrame(group = repeat('A':'C', outer = N), x = 1:(3 * N))  # create a DataFrame
myDT.y = myDT.x .* rand(3 * N)  # add a new column y
myDT[myDT.group .== 'A', :y] .= 0  # set y to 0 where group == 'A'

With JuliaDB I tried:

using JuliaDB
N = 10^9
table((group = repeat('A':'C', outer = N), x = 1:(3 * N)))

but it consumes all my RAM and fails with:

ERROR: OutOfMemoryError()

The docs show how to load data from CSV files, but I haven't seen how to use the out-of-core functionality to create tables (or whatever structure) larger than memory and save the results.

JeffBezanson commented 5 years ago

I don't believe we have the ability to do this yet. We should add it.

skanskan commented 5 years ago

It would be great.