Closed ianbtr closed 4 years ago
Adding escape characters to the output file for the Python parser fundamentally changes the source token text as seen from within Python. For example, when the first character of the token is requested after the file has been parsed by the Python CSV parser, the results are:
Input => Output
\"foo => \
""foo" => f
'"foo' => '
Alternatively, setting the Python CSV parser to ignore quotes yields the correct character:
reader = csv.reader(tsvfile, delimiter='\t', quoting=csv.QUOTE_NONE)
Input => Output
"foo => "
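The two tables above can be reproduced with a short script. This is a minimal sketch, assuming a tab-separated row whose second column holds the token; the helper name `first_char` and the three-column row layout are illustrative only and are not part of iTrace.

```python
import csv
import io


def first_char(token, **reader_kwargs):
    """Return the first character of `token` after parsing with csv.reader.

    The three-column row used here is a made-up layout, not the iTrace format.
    """
    line = io.StringIO(f"144\t{token}\t121\n")
    row = next(csv.reader(line, delimiter="\t", **reader_kwargs))
    return row[1][0]


# Default quoting: the reader interprets quote characters.
for token in ['\\"foo', '""foo"', "'\"foo'"]:
    print(repr(token), "=>", repr(first_char(token)))

# QUOTE_NONE: quote characters are treated as ordinary text.
print(repr('"foo'), "=>", repr(first_char('"foo', quoting=csv.QUOTE_NONE)))
```

With `quoting=csv.QUOTE_NONE` the token comes back exactly as written in the file, which is why the first character is the quote itself.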
If escape characters are still desired, or if other tools do not offer an option to ignore quotes, the file could be transformed with a regex find-and-replace or an equivalent step. The next version of the iTrace framework (releasing this summer) will not use the TSV output approach for post-processing data and will instead use an SQLite database.
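One possible shape for such a find-and-replace, as a sketch only: wrap every field that contains a double quote in quotes and double the inner quotes, which is the convention the default Python CSV reader understands. The function name `escape_quotes` and the sample row are hypothetical, and the pattern assumes that no field contains a tab or a newline.

```python
import csv
import io
import re

# Matches a whole tab-delimited field that contains at least one double quote.
# Assumption: fields never contain tabs or newlines themselves.
FIELD_WITH_QUOTE = re.compile(r'[^\t\r\n]*"[^\t\r\n]*')


def escape_quotes(tsv_text):
    """Quote each field containing '"' and double its inner quotes."""
    return FIELD_WITH_QUOTE.sub(
        lambda m: '"' + m.group(0).replace('"', '""') + '"',
        tsv_text,
    )


# Hypothetical sample row; the token column holds an unescaped quote.
escaped = escape_quotes('144\t"foo\t7\n')
rows = list(csv.reader(io.StringIO(escaped), delimiter="\t"))
print(rows)
```

After the transformation, the default reader recovers the original token text, so downstream code does not need `csv.QUOTE_NONE`.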
Does this help to resolve your issue?
Yes, using quoting=csv.QUOTE_NONE fixed the problem.
Thanks!
Terrific! Happy to help.
Quote characters are not properly escaped. For instance, the following file might be produced (tabs are replaced with commas):
FIXATION_ID, X, Y, TOKEN/VIEW, SYNTATIC_CAT, XPATH, DURATION, RIGHT_PUPIL, LEFT_PUPIL
144, 121, 122, "Foo, ...
Because the quote is unescaped, the rest of the file is treated as a single entry by parsers, such as the Python CSV reader.
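The effect can be seen with a few lines of Python. This is a sketch using made-up sample data with a shortened header (the values after the first data row are invented for illustration); `read_rows` is a hypothetical helper.

```python
import csv
import io

# Hypothetical sample data: the first data row has an unescaped opening quote.
SAMPLE = (
    "FIXATION_ID\tX\tY\tTOKEN\n"
    '144\t121\t122\t"Foo\n'
    "145\t130\t131\tbar\n"
)


def read_rows(text, **reader_kwargs):
    """Parse tab-separated text and return all rows as lists of strings."""
    return list(csv.reader(io.StringIO(text), delimiter="\t", **reader_kwargs))


default_rows = read_rows(SAMPLE)
literal_rows = read_rows(SAMPLE, quoting=csv.QUOTE_NONE)

# With default quoting, everything after the quote lands in one field.
print(len(default_rows), repr(default_rows[1][3]))
# With QUOTE_NONE, each line is parsed as its own row.
print(len(literal_rows), repr(literal_rows[1][3]))
```

With default quoting the reader keeps consuming text as part of the quoted field until the end of the input, so the later row never appears as a separate record.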