SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License

from_csv via String #534

Open reedjosh opened 4 years ago

reedjosh commented 4 years ago

Feature Request: I get CSV output from a command. I would like a way to give Daru.from_csv a CSV string to parse instead of a path.

In Pandas I do:

pd.read_csv(StringIO(results.stdout))

I don't know if StringIO is doable for Daru.from_csv, but for now I'm working around it with:

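require 'csv'
require 'daru'

# Build a DataFrame from a CSV string whose first row holds the headers.
def csv_string_to_df(csv_str)
  csv_as_arrays = ::CSV.parse(csv_str)
  headers = csv_as_arrays.shift
  # Turn the remaining rows into columns so each header maps to one column.
  csv_as_arrays = csv_as_arrays.transpose

  hsh = {}
  headers.each_with_index do |h, i|
    hsh[h] = csv_as_arrays[i]
  end
  Daru::DataFrame.new(hsh)
end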
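
A possible variation on this workaround, as a minimal sketch: let the standard csv library handle the header row via `headers: true` instead of shifting and transposing by hand. The helper name `csv_string_to_columns` is hypothetical, and the sketch stops at a plain Hash so it runs without daru; the result should be usable as the argument to Daru::DataFrame.new in the same way as `hsh` above.

```ruby
require 'csv'

# Hypothetical helper (not part of daru): map each header to its column values.
def csv_string_to_columns(csv_str)
  # headers: true makes CSV.parse return a CSV::Table keyed by the first row.
  table = CSV.parse(csv_str, headers: true)
  # Indexing the table with a header name returns that column as an Array.
  table.headers.to_h { |header| [header, table[header]] }
end

columns = csv_string_to_columns("name,score\nalice,1\nbob,2\n")
# Values stay Strings here because no converters were requested.
# With daru loaded, this hash could be passed on: Daru::DataFrame.new(columns)
```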

I think that if from_csv read the file contents first and then passed them to CSV.parse, I could use Ruby's StringIO in place of a path, since the reading would happen outside of the CSV lib.
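
To illustrate the idea, here is a rough sketch of reading the data first and only then handing it to CSV.parse. The helper name `read_csv_rows` is made up for this example and is not how Daru.from_csv is currently implemented; it only assumes that anything responding to `read` (a StringIO, an open File) can stand in for a path.

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper: accept either an IO-like object or a file path.
def read_csv_rows(source)
  # IO-like objects (StringIO, File) are read directly; anything else is
  # treated as a path on disk.
  data = source.respond_to?(:read) ? source.read : File.read(source)
  CSV.parse(data)
end

rows = read_csv_rows(StringIO.new("a,b\n1,2\n"))
```

Because the parsing step only ever sees a String, command output wrapped in StringIO and a file on disk go through the same code path.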

Thank you!

athityakumar commented 4 years ago

@reedjosh - Thanks a lot for suggesting this feature! 🎉

We currently have a read_csv method that reads from a CSV file. If you want a similar from_csv method that imports from a CSV String / StringIO, it would involve a very minor code change in the daru-io repository.

Specifically in this file, you can add a from method that looks like this:

def from(file_data)
  @file_data = file_data
  self
end

and expose a from_csv method like this in the same file:

Daru::DataFrame.register_io_module :from_csv, self
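
As a rough, self-contained sketch of how such a chainable from method might be used: the class name `CsvStringImporter` and its `call` method are hypothetical stand-ins written for this example, not daru-io's actual importer code, and the sketch returns a plain Hash rather than a Daru::DataFrame so that it runs with only the standard library.

```ruby
require 'csv'

# Hypothetical stand-in for an importer class; not daru-io's real code.
class CsvStringImporter
  # Store the raw CSV string and return self so calls can be chained.
  def from(file_data)
    @file_data = file_data
    self
  end

  # Hypothetical: parse the stored string into { header => column values }.
  def call
    table = CSV.parse(@file_data, headers: true)
    table.headers.to_h { |header| [header, table[header]] }
  end
end

columns = CsvStringImporter.new.from("x,y\n1,2\n3,4\n").call
```

Returning self from from is what allows the parsing step to be chained onto it, which is the shape the snippet above suggests.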

Would you be willing to contribute this feature, @reedjosh?

reedjosh commented 4 years ago

I would love to. It will be a bit before I can get to it though. Will update you when I get going on it.