SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License

symbols as column headers via symbol_headers=true|false option to from_csv #362

Closed parthm closed 7 years ago

parthm commented 7 years ago

While reading a CSV, it might be convenient to optionally have the headers be symbols instead of strings. As of now I just read the CSV and convert the headers using the code below.

  df.vectors = Daru::Index.new(df.vectors.to_a.map(&:downcase).map(&:to_sym))

This allows us to conveniently access vector content with v[:foo] and dataframe vectors with df[:bar], which arguably is a more Ruby-like way to access values. This is intuitive, as symbols are frequently used as keys for Ruby data structures. Perhaps a flag like symbol_headers=true|false can be supported in the from_csv method. When writing, the symbols are obviously stored as strings, so nothing changes on that front.
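
To make the conversion concrete without a dataframe, here is a minimal sketch in plain Ruby. The `headers` array and the `row` hash are hypothetical stand-ins for `df.vectors.to_a` and a row of data; they are not part of daru.

```ruby
# Hypothetical header strings, standing in for df.vectors.to_a
headers = ["Foo", "Bar"]

# Same transformation as the one-liner above: downcase, then symbolize
symbol_headers = headers.map(&:downcase).map(&:to_sym)

# A plain Hash keyed by those symbols, standing in for one row of data
row = symbol_headers.zip([1, 2]).to_h

puts symbol_headers.inspect
puts row[:foo]
```

Note that this approach leaves whitespace untouched, so a header such as "Bar Baz" would become the quoted symbol :"bar baz" rather than :bar_baz.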

gnilrets commented 7 years ago

You can use whatever header converters you want by passing options to the standard Ruby CSV parser (http://ruby-doc.org/stdlib-2.0.0/libdoc/csv/rdoc/CSV.html#method-i-header_converters). Here's one way to get what you want:

df = Daru::DataFrame.from_csv myfile, { headers: true, header_converters: CSV::HeaderConverters[:symbol] }
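
As a rough illustration of what that converter does, the sketch below uses only the standard library's CSV parser, with no daru involved. The CSV content in `data` is made up for the example, and the converter is passed by its name, `:symbol`, which should be equivalent to passing `CSV::HeaderConverters[:symbol]`.

```ruby
require 'csv'

# Hypothetical CSV content, used only for illustration
data = "Foo,Bar Baz\n1,2\n"

# headers: true treats the first line as the header row;
# the :symbol converter downcases each header, replaces
# whitespace with underscores, and turns it into a Symbol
table = CSV.parse(data, headers: true, header_converters: :symbol)

puts table.headers.inspect
puts table[0][:foo]
```

Field values stay as strings here because no field converters were requested; only the headers are changed.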

parthm commented 7 years ago

Ah. Perfect. Thanks for pointing it out, @gnilrets. I was not aware of that.

v0dro commented 7 years ago

I think the problem is solved. Closing.