SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License

Suggestion: DataFrame to two-dimensional array #524

Closed: janpeterka closed this issue 4 years ago

janpeterka commented 4 years ago

I need a two-dimensional array as training data for linear regression, so I would like to add the following method (maybe with a better name) to Daru::DataFrame:

# dataframe.rb

def to_multi_a
  each_row.map(&:to_a)
end

It would be used like this:

multi_array = data_frame.to_multi_a
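For illustration, here is a minimal sketch of the shape such a method would return: one inner array per row. It uses a plain Hash of columns as a hypothetical stand-in for a Daru::DataFrame, so no Daru calls are involved, and the column names and values are made up.

```ruby
# Hypothetical stand-in for a data frame: column name => column values.
# (Not a Daru object; names and values are made up for illustration.)
columns = {
  x1: [1.0, 2.0, 3.0],
  x2: [10.0, 20.0, 30.0]
}

# Turn column-oriented data into one inner array per row, which is the
# row-major layout the proposed to_multi_a is meant to produce.
rows = columns.values.transpose

p rows # prints [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
```

Each inner array is one observation, which is the layout most regression routines expect for training data.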

Thoughts? Better ways to do this?

Thanks for reactions :)

kojix2 commented 4 years ago

Hello. Daru is not so active these days... But I use the following idiom to get a 2D array.

df.to_matrix.to_a
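As a rough sketch of why this idiom yields a two-dimensional array: it assumes that `to_matrix` returns a standard-library `Matrix` built from the data frame's columns (and, as far as I know, only the numeric ones, so non-numeric columns may be dropped). The example below uses only the stdlib `Matrix` as a stand-in, with made-up values, and does not call Daru.

```ruby
require 'matrix'

# Stand-in for what df.to_matrix is assumed to return: a Matrix whose
# columns are the numeric columns of the data frame (values are made up).
matrix = Matrix.columns([[1, 2, 3], [4, 5, 6]])

# Matrix#to_a returns an array of row arrays, i.e. a 2D array.
two_d = matrix.to_a

p two_d # prints [[1, 4], [2, 5], [3, 6]]
```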

janpeterka commented 4 years ago

Thanks @kojix2, closing this issue

javadba commented 4 years ago

"Hello. Daru is not so active these days..."

Really? I do see recent activity. I am looking for the Ruby equivalent of R's data.table. Is this the most active repo for it?

kojix2 commented 4 years ago

I think Daru is the best data table library in Ruby, and there is no alternative project.

The local Ruby community in Japan is interested in Apache Arrow, but it's not a global trend right now. Apache Arrow is a project of Wes McKinney, the original creator of pandas, and is expected to be the backend for pandas in the future.

Sure, Apache Arrow will improve performance. But it doesn't answer the question of what Ruby data tables should be like.

I think Daru is an elegant answer to that question. But it's not perfect. It's still developing...

So some people are trying to build Ruby data frames with Apache Arrow as a backend, but that will take years and can't replace Daru right away. Maybe we need an alternative to Hadley Wickham.