MikeStall / DataTable

Class library for working with tabular data, especially CSV files
Apache License 2.0

Allow for different separators #3

Closed: munissor closed this issue 12 years ago

munissor commented 12 years ago

Hi Mike,

At the moment the library supports only commas and tabs as separators. It would be nice to allow users of the library to specify a custom separator.

Software like Excel, OpenOffice, LibreOffice, ... uses the list separator of the system locale when exporting as CSV (in the Italian locale the separator is ';').

Riccardo

MikeStall commented 12 years ago

Good point, thanks. The underlying parser actually supports this (https://github.com/MikeStall/DataTable/blob/master/Sources/DataAccess/Readers.cs), but it's not surfaced up to the public builder functions.

One possible fix would be to extend Reader.GuessSeparateFromHeaderRow to also check for ';'. This is an internal function called when the reader tries to guess the delimiter from the file contents. If it were updated, DataTable.New.Read() would automatically pick up a ';' whenever one appeared in the header row.
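To make the idea of guessing from the header row concrete, here is a minimal sketch in Python. It is not the library's code: the function name `guess_separator`, the candidate list, and its ordering are all assumptions made for illustration. It simply returns the first candidate separator that occurs in the header row.

```python
# Hypothetical candidate list (an assumption, not taken from the library).
# The order encodes priority: earlier entries win when several are present.
CANDIDATES = ["\t", ",", ";"]


def guess_separator(header_row: str, default: str = ",") -> str:
    """Return the first candidate separator found in the header row.

    Falls back to `default` when no candidate appears (for example,
    a file with a single column).
    """
    for sep in CANDIDATES:
        if sep in header_row:
            return sep
    return default


# A semicolon-only header is detected as ';'.
print(repr(guess_separator("name;age;city")))
# When both appear, the comma wins because it comes first in CANDIDATES.
print(repr(guess_separator("name,age;city")))
```

A first-match scan like this is easy to extend, but it can misfire when a higher-priority character appears inside a column name, which is one reason an explicit delimiter parameter is also useful.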

If you want to make a patch for that, I'd accept it.

Another fix would be to surface the delimiter parameter in the builder functions in https://github.com/MikeStall/DataTable/blob/master/Sources/DataAccess/DataTableBuilder.cs.

MikeStall commented 12 years ago

Just pushed a fix.

  1. The default guess now also looks for semicolons (at lower priority than a comma).
  2. Added a new overload to DataTable.New.Read that explicitly takes a delimiter, so you can pass anything.
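The two behaviours described in this fix (a default guess, plus an explicit delimiter that overrides it) can be sketched as follows. This is an illustration in Python using the standard `csv` module, not the DataTable API: `read_rows` and its `delimiter` parameter are hypothetical names, and the tab-before-comma ordering is an assumption. Only the "semicolon after comma" ordering comes from the comment above.

```python
import csv
import io


def read_rows(text, delimiter=None):
    """Parse delimited text into a list of rows.

    If `delimiter` is None, guess it from the header row, preferring
    a comma over a semicolon. Otherwise use the caller's value as-is.
    """
    if delimiter is None:
        header = text.splitlines()[0] if text else ""
        # Hypothetical priority order; fall back to a comma if none match.
        delimiter = next((c for c in ("\t", ",", ";") if c in header), ",")
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Default guess: the semicolon is picked up from the header row.
guessed = read_rows("name;age\nAda;36\n")

# Explicit delimiter: any single character can be passed.
explicit = read_rows("name|age\nAda|36\n", delimiter="|")
```

Both calls yield the same two rows, `["name", "age"]` and `["Ada", "36"]`; the explicit form is the safer choice when the separator is known in advance.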