dashbitco / nimble_csv

A simple and fast CSV parsing and dumping library for Elixir
https://hexdocs.pm/nimble_csv

Transform stream to line based one before trying to parse #57

Closed LostKobrakai closed 4 years ago

LostKobrakai commented 4 years ago

Sometimes streams are not yet line-based (e.g. when streaming over HTTP). I'm wondering whether it makes sense for nimble_csv to optionally handle such streams by doing something akin to this before parsing the data:

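# Reducer for Stream.transform: `acc` holds the trailing partial line seen so far.
chunk_fun = fn element, acc ->
  parts = String.split(element, "\n")

  # The last part is an incomplete line; every part before it ended with "\n".
  case List.pop_at(parts, -1) do
    {nil, []} -> {[], acc}
    # No newline in this chunk: keep buffering instead of dropping `acc`.
    {new_acc, []} -> {[], acc <> new_acc}
    # The first part completes the buffered partial line.
    {new_acc, [h | t]} -> {[acc <> h | t], new_acc}
  end
end

after_fun = fn
  "" -> []
  acc -> [acc]
end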

[
    "abc",
    "def\n",
    "abc\ndef",
    "abc\ndef\nmore",
    "\nok"
]
|> Stream.transform(fn -> "" end, chunk_fun, after_fun)
|> Enum.into([])
|> IO.inspect()
# ["abcdef", "abc", "defabc", "def", "more"]
# (the trailing "ok" is missing: Stream.transform/4 discards the return value of after_fun)

josevalim commented 4 years ago

A PR is welcome! Although I would expose it as a separate function, so as not to add overhead to the existing parse_stream. :)
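
For illustration, here is a minimal sketch of what such a separate function could look like. The module name `LineStream` and the function `to_lines/1` are hypothetical and not part of nimble_csv. The sketch assumes Elixir 1.14 or later, where `Stream.transform/5` accepts a `last_fun` that may emit elements, so the final partial line is flushed. It also keeps the trailing "\n" on each complete line, mirroring what `File.stream!/1` emits by default; that is a design assumption, not something this thread specifies.

```elixir
defmodule LineStream do
  @moduledoc """
  Hypothetical helper (not part of nimble_csv): lazily converts a stream of
  arbitrarily chunked binaries into a stream of lines.
  """

  # Assumes Elixir 1.14+ for Stream.transform/5.
  def to_lines(stream) do
    Stream.transform(
      stream,
      # start_fun: the accumulator is the partial line carried between chunks
      fn -> "" end,
      &split_chunk/2,
      # last_fun: flush whatever is still buffered once the input ends
      fn
        "" -> {[], ""}
        rest -> {[rest], ""}
      end,
      # after_fun: nothing to clean up
      fn _acc -> :ok end
    )
  end

  defp split_chunk(chunk, acc) do
    # Prepend the buffered partial line, split on "\n", and hold back the
    # last part, which is incomplete until the next newline arrives.
    {lines, [rest]} = (acc <> chunk) |> String.split("\n") |> Enum.split(-1)
    {Enum.map(lines, &(&1 <> "\n")), rest}
  end
end

["abc", "def\n", "abc\ndef", "abc\ndef\nmore", "\nok"]
|> LineStream.to_lines()
|> Enum.to_list()
|> IO.inspect()
# ["abcdef\n", "abc\n", "defabc\n", "def\n", "more\n", "ok"]
```

Because the conversion lives in its own function, callers that already have line-based input pay nothing extra, and others could opt in with something like `chunks |> LineStream.to_lines() |> NimbleCSV.RFC4180.parse_stream()`.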