Closed joverlee521 closed 1 week ago
The escaped double quotes ("" for internal quotes) are more correct, no?
Hmm, yes. I guess the issue comes up when running through augur curate multiple times.
The double quotes keep getting added on with each pass. This is an example of a string with internal quotes that goes through curate 3 times:
SRC VB "Vector", Molecular Biology of Genomes
"SRC VB ""Vector"", Molecular Biology of Genomes"
"SRC VB ""Vector"""", Molecular Biology of Genomes"""
"SRC VB ""Vector"""""""", Molecular Biology of Genomes"""""""
OH! Yeah, that's a misconfiguration of the parser/producer then.
Current Behavior
When using the --metadata input, field values with double quotes in them can result in additional double quotes in the output.
Since the metadata is read through csv.DictReader, we can probably tweak this behavior through the csv.Dialect attributes:
https://github.com/nextstrain/augur/blob/961cb0042c6744cff3925dd97251187e4532a082/augur/io/metadata.py#L183
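To illustrate how reader and writer quoting settings can interact, here is a minimal sketch using only Python's standard csv module. It is not augur's code: the round_trip helper is hypothetical, and the settings shown (a reader with quoting=csv.QUOTE_NONE paired with a writer using the default QUOTE_MINIMAL) are an assumption chosen to show one way quotes can pile up. The exact strings it produces differ from the ones quoted above, so treat it as a demonstration of the general mechanism rather than a reproduction of this bug.

```python
import csv
import io


def round_trip(value, reader_kwargs, writer_kwargs, passes=3):
    """Hypothetical helper: write `value` as a one-field TSV row, read it
    back, and repeat. Returns the serialized line produced by each pass."""
    lines = []
    for _ in range(passes):
        buffer = io.StringIO()
        writer = csv.writer(
            buffer, delimiter="\t", lineterminator="\n", **writer_kwargs
        )
        writer.writerow([value])
        line = buffer.getvalue().rstrip("\n")
        lines.append(line)

        # Feed the serialized line back in, as a second pass of a tool would.
        reader = csv.reader(
            io.StringIO(line + "\n"), delimiter="\t", **reader_kwargs
        )
        value = next(reader)[0]
    return lines


original = 'SRC VB "Vector", Molecular Biology of Genomes'

# Reader and writer agree on quoting (both defaults): output is stable.
for line in round_trip(original, reader_kwargs={}, writer_kwargs={}):
    print("matched:   ", line)

# Assumed mismatch: the reader treats quotes as literal characters while
# the writer still quotes and doubles them, so every pass adds more quotes.
mismatched = round_trip(
    original,
    reader_kwargs={"quoting": csv.QUOTE_NONE},
    writer_kwargs={},
)
for line in mismatched:
    print("mismatched:", line)
```

The point of the sketch is that whichever csv.Dialect attributes are chosen (quoting, quotechar, doublequote, escapechar), the reader and the writer need to agree on them, otherwise repeated passes are not idempotent.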
Additional context
This was first observed in https://github.com/nextstrain/monkeypox/pull/179