frictionlessdata / tableschema-rb

A Ruby library for working with JSON Table Schema.
MIT License

Ability to stream data #10

Closed: roll closed 7 years ago

roll commented 8 years ago

I'm not really familiar with Ruby, so a question: here - https://github.com/theodi/jsontableschema.rb/blob/master/lib/jsontableschema/table.rb - does the data get loaded into memory completely?

In Python we have two kinds of methods for getting data:

roll commented 8 years ago

Aren't we still reading everything into memory here - https://github.com/theodi/jsontableschema.rb/blob/master/lib/jsontableschema/table.rb#L17?

pezholio commented 8 years ago

D'oh, you're right

roll commented 8 years ago

So now we do:

for row in table.rows():
     print(row)

And we'll be loading only one row into memory per iteration?

pezholio commented 8 years ago

Yeah, that's right - we initialise the CSV with CSV.new and an IO Object (usually a file pointer, but this can be a StringIO if an array is passed in) then each_with_index here loads the data row by row. Thanks for noticing that! :+1:
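As a rough illustration of the approach described above, here is a minimal sketch using only Ruby's standard `csv` and `stringio` libraries. The helper names `open_source` and `stream_rows` and the sample rows are hypothetical and are not taken from the library's code; only `CSV.new`, `StringIO` and `each_with_index` come from the description.

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper (not the library's API): wrap the source in an IO
# object that CSV.new can read from. An Array is serialised into a StringIO
# (so it is already in memory); anything else is treated as a file path.
def open_source(source)
  if source.is_a?(Array)
    StringIO.new(source.map(&:to_csv).join)
  else
    File.open(source)
  end
end

# Hypothetical helper: yield each parsed row together with its index.
# CSV.new(io) does not return an array of all rows; rows are parsed and
# handed to the block one by one as each_with_index iterates.
def stream_rows(source)
  return enum_for(:stream_rows, source) unless block_given?

  io = open_source(source)
  begin
    CSV.new(io).each_with_index do |row, index|
      yield row, index
    end
  ensure
    io.close
  end
end

# Usage with made-up sample data passed in as an array.
stream_rows([%w[id name], %w[1 alice], %w[2 bob]]) do |row, index|
  puts "#{index}: #{row.inspect}"
end
```

With a file path instead of an array, the same loop reads from the file pointer, so the full set of parsed rows is never collected unless the caller collects it.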

roll commented 8 years ago

@pezholio Awesome!

We've got another review from our Ruby specialist @georgiana-b. She asked me to raise this line https://github.com/theodi/jsontableschema.rb/blob/master/lib/jsontableschema/data.rb#L8 - as a point where everything might be loaded into memory.

But we don't want to block the whole work on this feature if it still has problems with streaming. Please just open an issue if this is still relevant. And sorry if not.
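For readers unfamiliar with what a "load everything into memory" point looks like in Ruby, the following is a generic sketch of the distinction a reviewer would be checking for. It makes no claim about what the linked line in data.rb actually contains; the sample CSV text and variable names are made up for illustration.

```ruby
require 'csv'
require 'stringio'

# Made-up sample data for illustration only.
csv_text = "id,name\n1,alice\n2,bob\n"

# Eager: CSV.parse without a block returns an Array holding every parsed
# row, so memory use grows with the size of the input.
all_rows = CSV.parse(csv_text)

# Row by row: CSV.new wraps an IO object, and shift parses and returns
# just the next row each time it is called.
io = StringIO.new(csv_text)
first_row = CSV.new(io).shift

puts all_rows.length
puts first_row.inspect
```

Calls such as `CSV.parse`, `CSV.read`, or `to_a` on a CSV object are the usual places where a whole dataset ends up in memory at once.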

roll commented 7 years ago

@georgiana-b I'm re-opening for now. It's in the backlog, but if you get any information on this topic while working on the v1 upgrade, please write here. I'm not sure what its status is.