dashbitco / nimble_csv

A simple and fast CSV parsing and dumping library for Elixir
https://hexdocs.pm/nimble_csv

Is there any easy way to keep the headers as the keys of each row of output, if the columns are not known in advance? #37

Closed x-ji closed 5 years ago

x-ji commented 5 years ago

In the CSV library, one can simply specify headers: true to get a result like

iex> ["a;b","c;d", "e;f"]
iex> |> Stream.map(&(&1))
iex> |> CSV.decode!(separator: ?;, headers: true)
iex> |> Enum.take(2)
[
  %{"a" => "c", "b" => "d"},
  %{"a" => "e", "b" => "f"}
]

However, I haven't been able to figure out a way to easily do it with nimble_csv.

There is an example of

"name\tage\njohn\t27"
|> MyParser.parse_string
|> Enum.map(fn [name, age] ->
  %{name: name, age: String.to_integer(age)}
end)

However, this requires me to know in advance:

  1. How many columns the file will have
  2. The name of each column

My use case is to handle CSV files with potentially unknown columns. But I need to produce a map as in the first example.

Seems that the only way to do it with nimble_csv would be to

  1. Specify skip_headers: false so the header row is kept
  2. Take the head of the resulting list as the keys
  3. Perform an Enum.map on the tail of the list to pair each value with its key

This seems quite convoluted. Have I misunderstood the library, and is there an easier way to do it? Or is my use case not suited to nimble_csv, and should I rethink my approach?

josevalim commented 5 years ago

Today you can only skip the headers (if you know them upfront) OR do this:

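# Keep the header row; parse_string drops it by default.
[headers | rest] = MyParser.parse_string("name\tage\njohn\t27", skip_headers: false)
# Zip each row with the headers and build a map from the pairs.
Enum.map(rest, fn row -> headers |> Enum.zip(row) |> Map.new() end)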

It is not convoluted per se, but it is something you have to do yourself.
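
For larger files, the same idea can be applied lazily so that the rows never have to be held in memory as one list. The following is only a minimal sketch: `HeaderRows` and `to_maps/1` are hypothetical names, not part of nimble_csv, and the input is assumed to be any enumerable of rows (lists of strings) whose first row is the header.

```elixir
defmodule HeaderRows do
  # Hypothetical helper, not part of nimble_csv.
  # Treats the first row as the header and lazily emits one map
  # per remaining row, keyed by the header values.
  def to_maps(rows) do
    Stream.transform(rows, nil, fn
      # First row: remember it as the headers and emit nothing.
      headers, nil -> {[], headers}
      # Later rows: pair each value with its header and build a map.
      row, headers -> {[headers |> Enum.zip(row) |> Map.new()], headers}
    end)
  end
end

# Rows shaped as a parser would return them when the header row is kept.
rows = [["name", "age"], ["john", "27"], ["mary", "31"]]

maps = rows |> HeaderRows.to_maps() |> Enum.to_list()
IO.inspect(maps)
```

With nimble_csv, the rows would presumably come from something like MyParser.parse_stream(stream, skip_headers: false). Note that Enum.zip stops at the shorter of its two inputs, so a row with fewer fields than the header silently produces a map with fewer keys.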

x-ji commented 5 years ago

Thanks for the quick reply and help. This makes sense.