nextstrain / augur

Pipeline components for real-time phylodynamic analysis
https://docs.nextstrain.org/projects/augur/
GNU Affero General Public License v3.0
268 stars 129 forks source link

Fix curate internal quotes take 2 #1565

Open joverlee521 opened 2 months ago

joverlee521 commented 2 months ago

Description of proposed changes

Follow the general pattern of creating CSV-like TSVs as discussed in https://github.com/nextstrain/augur/pull/1563#discussion_r1699260164.

We are expecting the CSV-like double quoting when there are internal quotes. If the field value is already correctly double quoted, then there should not be any additional quotes.

Related issue(s)

Resolves https://github.com/nextstrain/augur/issues/1312

Checklist

codecov[bot] commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 70.22%. Comparing base (c28fab2) to head (ad029f6).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #1565 +/- ## ======================================= Coverage 70.22% 70.22% ======================================= Files 76 76 Lines 7987 7987 Branches 1947 1947 ======================================= Hits 5609 5609 Misses 2092 2092 Partials 286 286 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

tsibley commented 2 months ago

Let's discuss this topic more first. I'll open an issue for it.

tsibley commented 2 months ago

https://github.com/nextstrain/augur/issues/1566

tsibley commented 4 weeks ago

Saving this here for later…

Quote in field value in CSV input.

  $ augur curate passthru --metadata <(echo 'a,b'; echo 'x,"y""z"') --output-metadata -
  a b
  x "y""z"

Quote in field value in NDJSON input.

  $ augur curate passthru --output-metadata - <<<'{"a":"x","b":"y\"z"}'
  a b
  x "y""z"

Tab in field value in NDJSON input.

  $ augur curate passthru --output-metadata - <<<'{"x":"\u0009"}'
  x
  " "
joverlee521 commented 1 day ago

I think this PR should be ready for review based on https://github.com/nextstrain/augur/issues/1566#issuecomment-2356678703.