Closed xrotwang closed 2 years ago
Merging #59 (3eb452f) into master (45584ad) will not change coverage. The diff coverage is
100.00%
.
@@ Coverage Diff @@
## master #59 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 16 16
Lines 2534 2540 +6
=========================================
+ Hits 2534 2540 +6
Impacted Files | Coverage Δ | |
---|---|---|
tests/test_csv_escapechar.py | 100.00% <100.00%> (ø) |
|
tests/test_dsv.py | 100.00% <100.00%> (ø) |
|
tests/test_metadata.py | 100.00% <100.00%> (ø) |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 45584ad...3eb452f. Read the comment docs.
I really find it difficult to wrap my head around this. Fortunately, our default settings presumably save us from this problem most of the times:
>>> from csvw.dsv import Dialect
>>> Dialect().escape_character
'"'
Anyway, Python 3.10 seems to be "correct" in the sense that writing and reading CSV data with escapechar='\\'
passed to both csv.writer
and csv.reader
does roundtrip data containing \\
. But that seems semantically wrong to me. I'd say that by definition you cannot use the same setting for escapechar
for writing and reading. Once data is written, it should be platform agnostic. Basically, escapechar
shouldn't have any impact upon reading at all.
Just to exemplify what I mean above: I ran the following piece of code:
import sys
import csv
row = ['spam', '\\eggs']
print(sys.version)
with open('eggs.csv', 'w', newline='') as csvfile:
spamwriter = csv.writer(csvfile, escapechar='\\')
spamwriter.writerow(row)
with open('eggs.csv', newline='') as csvfile:
spamreader = csv.reader(csvfile, escapechar='\\')
for r in spamreader:
assert r == row, '{} vs {}'.format(r, row)
break
with py3.9:
3.9.10 (main, Jan 15 2022, 18:17:56)
[GCC 9.3.0]
Traceback (most recent call last):
File "/home/robert_forkel/test2.py", line 16, in <module>
assert r == row, '{} vs {}'.format(r, row)
AssertionError: ['spam', 'eggs'] vs ['spam', '\\eggs']
and py3.10:
3.10.2 (main, Jan 15 2022, 18:02:07) [GCC 9.3.0]
Naively, I'd say py3.10 is "more" correct. But upon reflection, I'd say py3.9 is, because it does the right thing when not specifying escapechar
for reading.
So this PR simply accepts that we can't guarantee consistent roundtrip behaviour across all supported Python versions.
closes #58