The standard CSV parser in Go treats double quotes as special characters. However, TAB-delimited files very often do not quote their string values; the Geonames data sets are one example. When a value starts with a double quote, the parser loads the data incorrectly:
$ cat input.txt
name
"John" the first.
"Marry" the second.
$ bin/comp -f input.txt 'input'
[ { "name": "John\" the first.\n\"Marry\" the second.\n" } ]
Therefore, this commit uses a simpler and faster approach to parse TAB-delimited files: in ".txt" files, every character except '\t' and '\n' is treated as part of the value:
$ bin/comp -f input.txt 'input'
[ { "name": "\"John\" the first." }, { "name": "\"Marry\" the second." } ]