close2 / csv

A dart csv to list codec / converter
MIT License

Specify at least a major version when adding this project as a dependency. The major version changes whenever the API changes incompatibly!

Changes from version 3 to 4:

Changes from version 2 to 3:

Changes from version 3.0 to 3.1

Changes from version 3.1 to 3.1.1

csv

A dart csv to list converter.

If you have a String of all rows with RFC conform separators and delimiters, simply convert them with:

List<List<dynamic>> rowsAsListOfValues = const CsvToListConverter().convert(yourString);

To convert to a Csv string your values must be in a List<List<dynamic>> representing a List of Rows where every Row is a List of values. You can then convert with:

String csv = const ListToCsvConverter().convert(yourListOfLists);

The default (RFC conform) configuration is ',' as field separator, '"' as text delimiter and '\r\n' as eol.

See below if you need other settings, or want to autodetect them.

This converter may be used as transformer for streams:

final stream = Stream.fromIterable([['a', 'b'], [1, 2]]);
final csvRowStream = stream.transform(ListToCsvConverter());

Or the decoder side:

// Needs dart:io (File) and dart:convert (utf8); must run inside an async function.
final input = File('a/csv/file.txt').openRead();
final fields = await input.transform(utf8.decoder).transform(CsvToListConverter()).toList();

The converter is highly customizable and even allows multiple characters as delimiters or separators.


The decoder

Every csv row is converted to a list of values. Unquoted strings looking like numbers (integers and doubles) are by default converted to ints or doubles.
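The number conversion described above can be pictured with a small, self-contained sketch. `parseUnquotedField` is a hypothetical helper written for illustration only; it is not the package's implementation, and the package may treat edge cases (whitespace, hex literals, 'NaN') differently.

```dart
/// Hypothetical helper, not part of the csv package: turns an unquoted
/// field into an int or a double when it parses as one, else keeps the text.
Object parseUnquotedField(String raw) {
  // Try int first so that '42' stays an int instead of becoming 42.0.
  return int.tryParse(raw) ?? double.tryParse(raw) ?? raw;
}
```

Under these assumptions '42' becomes an int, '3.1' becomes a double and 'abc' stays a String.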

The encoder

The input must be a List of Lists. Every inner list is converted to one output csv row. The string representation of values is obtained by calling toString.

This converter follows the rules of rfc4180.

This means that text fields containing any delimiter or an eol are quoted.
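As a rough illustration of this quoting rule, here is a minimal sketch that hard-codes the RFC 4180 characters (',' as separator, '"' as text delimiter, '\r\n' as eol). `quoteFieldIfNeeded` and `encodeRows` are hypothetical names, and the code is not the package's encoder; it only shows when a field gets quoted and how an embedded double quote is escaped by doubling it.

```dart
/// Hypothetical helper: quotes a field only when it contains a separator,
/// a text delimiter, or any single character of the eol.
String quoteFieldIfNeeded(String field) {
  final needsQuotes = field.contains(',') ||
      field.contains('"') ||
      field.contains('\r') ||
      field.contains('\n');
  if (!needsQuotes) return field;
  // An embedded double quote is escaped by preceding it with another one.
  final escaped = field.replaceAll('"', '""');
  return '"$escaped"';
}

/// Hypothetical helper: every value is stringified with toString (via
/// interpolation), fields are joined with ',' and rows with '\r\n'.
/// No line break is added after the last row.
String encodeRows(List<List<Object?>> rows) => rows
    .map((row) => row.map((value) => quoteFieldIfNeeded('$value')).join(','))
    .join('\r\n');
```

For instance, `encodeRows([['aaa', 'b"bb', 'ccc']])` gives `aaa,"b""bb",ccc`.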

The default configuration is ',' as field separator, '"' as text delimiter and '\r\n' as eol.

This parser will accept eol and text-delimiters inside unquoted text and not throw an error.

In addition, this converter supports multiple characters for all delimiters and eol. Also, the start text delimiter and end text delimiter may be different. This means the following text can be parsed: «abc«d»*|*«xy»»z»*|*123
And (if configured correctly) will return ['abc«d', 'xy»z', 123]

Usage

Encoder List<List> → String

If the default values are fine, simply instantiate ListToCsvConverter and call convert:

final res = const ListToCsvConverter().convert([[',b', 3.1, 42], ['n\n']]);
assert(res == '",b",3.1,42\r\n"n\n"');

Consider using the returnString = false option to work around a performance bug.

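There are 2 interesting things to note in this example: the rows do not have to be the same length, and the field 'n\n' is quoted although it contains only '\n' and not the complete default eol '\r\n'.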

The converter takes the following configurations either in the constructor or the convert function:

All configuration values may be multiple characters:

const conv = const ListToCsvConverter(fieldDelimiter: '|*|',
                                      textDelimiter: '<<',
                                      textEndDelimiter: '>>',
                                      eol: '**\n');
final res = conv.convert([['a','>'], ['<<', '>>'], [1, 2]]);
assert(res == 'a|*|<<>>>**\n<<<<>>|*|<<>>>>>>**\n1|*|2');

final res2 = const ListToCsvConverter()
    .convert([['a','>'], ['<<', '>>'], [1, 2]],
             fieldDelimiter: '|*|',
             textDelimiter: '<<',
             textEndDelimiter: '>>',
             eol: '**\n',
             convertNullTo: '');
assert(res == res2);

Note that:

Decoder String → List<List>

If the default values are fine, simply instantiate CsvToListConverter and call convert:

final res = const CsvToListConverter().convert('",b",3.1,42\r\n"n\n"');
assert(res.toString() == [[',b', 3.1, 42], ['n\n']].toString());

Again please note that depending on the input not all rows have the same number of values.

The CsvToListConverter takes the same arguments as the ListToCsvConverter (except for convertNullTo) plus

In this case eol will either be '\r\n' or '\n' depending on which of those 2 comes first in the csv string. Note that the FirstOccurrenceSettingsDetector doesn't parse the csv string! For instance if eol should be '\r\n' but there is a field with a correctly quoted '\n' in the first row, '\n' is used instead.

If your csv String contains a (simple) header row, or all eols are equal, this is good enough.

Feel free to submit something more intelligent.
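The eol pitfall described above can be reproduced with a small, self-contained sketch of a "first occurrence" heuristic. `firstOccurring` is a hypothetical helper; it is assumed to behave roughly like the detection described here, and is not the code of FirstOccurrenceSettingsDetector itself.

```dart
/// Hypothetical helper: returns the candidate that appears earliest in
/// [csv], or null when none of them occurs. The string is only searched,
/// never parsed, so quoted fields are not skipped.
String? firstOccurring(String csv, List<String> candidates) {
  String? best;
  var bestIndex = csv.length;
  for (final candidate in candidates) {
    final index = csv.indexOf(candidate);
    // Strict comparison: on a tie the earlier candidate in the list wins.
    if (index >= 0 && index < bestIndex) {
      best = candidate;
      bestIndex = index;
    }
  }
  return best;
}
```

With the candidates `['\r\n', '\n']`, the input `'a,b\r\nc,d'` yields '\r\n', while `'"x\ny",b\r\nc,d'` yields '\n' because the correctly quoted line feed in the first row comes before the real eol.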

To check your configuration values there is CsvToListConverter.verifySettings and verifyCurrentSettings. Both return an empty list if all settings are valid, or a list of errors. If the optional throwError is true an error is thrown in case the settings are invalid.

All settings must be set, i.e. not be null, and delimiters, separators and eols must be distinguishable, i.e. none of them may be the start of another setting.
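To make the "start of another setting" rule concrete, here is a minimal sketch of such a check. `findSettingConflicts` is a hypothetical function, not the package's verifySettings, and its error messages are invented for illustration. It assumes the caller passes only settings that have to differ from each other; a start and end text delimiter that are intentionally identical should be left out.

```dart
/// Hypothetical check: returns an empty list when all settings are set and
/// no setting is the start of another one, else a list of error messages.
List<String> findSettingConflicts(Map<String, String?> settings) {
  final errors = <String>[];
  settings.forEach((name, value) {
    if (value == null) errors.add('$name must be set');
  });
  final names = settings.keys.toList();
  for (final a in names) {
    for (final b in names) {
      final va = settings[a], vb = settings[b];
      if (a == b || va == null || vb == null) continue;
      // If vb begins with va, a parser cannot tell the two apart.
      if (vb.startsWith(va)) errors.add('$a is the start of $b');
    }
  }
  return errors;
}
```

For example, `{'fieldDelimiter': '|', 'eol': '|*'}` produces one error, whereas the default characters ',', '"' and '\r\n' produce none.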

CSV rules -- copied from RFC4180 Chapter 2

Ad rule 3: removed as it is not relevant for this converter.

  1. Each record is located on a separate line, delimited by a line break (CRLF). For example:

    aaa,bbb,ccc CRLF zzz,yyy,xxx CRLF

  2. The last record in the file may or may not have an ending line break. For example:

    aaa,bbb,ccc CRLF zzz,yyy,xxx

  3. ... (Header-lines)

  4. Within the header and each record, there may be one or more fields, separated by commas. Each line should contain the same number of fields throughout the file. Spaces are considered part of a field and should not be ignored. The last field in the record must not be followed by a comma. For example:

    aaa,bbb,ccc

  5. Each field may or may not be enclosed in double quotes (however some programs, such as Microsoft Excel, do not use double quotes at all). If fields are not enclosed with double quotes, then double quotes may not appear inside the fields. For example:

    "aaa","bbb","ccc" CRLF zzz,yyy,xxx

  6. Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes. For example:

    "aaa","b CRLF bb","ccc" CRLF zzz,yyy,xxx

  7. If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote. For example:

    "aaa","b""bb","ccc"