ohare93 / brain-brew

Automated Anki flashcard creation and extraction to/from Csv
The Unlicense

Stop using codecs.open #43

Closed: aplaice closed this 3 weeks ago

aplaice commented 2 years ago

codecs.open seems to result in weird errors (at least under some circumstances; see #39 and #42), and in Python 3 there is AFAICT no real benefit to using it over the built-in open(?).
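For illustration, here is a minimal sketch of what the replacement could look like, assuming the files are meant to be UTF-8. The helper names read_text and write_text are hypothetical and not taken from the brain-brew code base. The point is only that the built-in open with an explicit encoding does not depend on the platform's default encoding.

```python
import os
import tempfile


def write_text(path, text, encoding="utf-8"):
    # Hypothetical helper: built-in open with an explicit encoding,
    # so the result does not depend on the locale's default encoding.
    with open(path, "w", encoding=encoding) as f:
        f.write(text)


def read_text(path, encoding="utf-8"):
    # Hypothetical helper: same idea for reading.
    with open(path, "r", encoding=encoding) as f:
        return f.read()


original = "caf\u00e9 \u65e5\u672c\u8a9e"  # non-ASCII sample text
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.html")
    write_text(path, original)
    round_tripped = read_text(path)
```

Passing encoding explicitly is the important part; without it, open falls back to a platform-dependent default.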

Ideally™, we'd have a set of GitHub tests that would check that everything works on Windows, with arbitrary default encodings...


Aside on newlines

Should we use newline='' here as well (like in csv_file.py), to avoid newlines changing (from \n to \r\n) during a read-write round-trip on Windows?

(By default, open has newline=None. This means that when reading, all reasonable newlines (\n, \r\n etc.) are converted to \n. When writing, open writes each \n as os.linesep (i.e. \r\n on Windows). Hence, if we have \n newlines in an HTML file, these might end up converted to \r\n after a round-trip on Windows. Also, if somebody has \r\n newlines (for some reason), then a round-trip on Linux will change them to \n.)
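A small sketch of that round-trip, runnable on any platform. As an assumption made for the demo, the write passes newline="\r\n" to imitate what the default newline=None does on Windows, where os.linesep is \r\n. The file name is made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "template.html")  # made-up file name

    # Start with a file that uses \n newlines (written as raw bytes).
    with open(path, "wb") as f:
        f.write(b"<div>\n</div>\n")

    # Read with the default newline=None: newlines arrive as \n.
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()

    # Write back. newline="\r\n" imitates the default behaviour on
    # Windows, where every \n is written as os.linesep.
    with open(path, "w", encoding="utf-8", newline="\r\n") as f:
        f.write(text)

    with open(path, "rb") as f:
        after_round_trip = f.read()
```

After this, the bytes on disk use \r\n even though the original file used \n.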

In contrast, newline='' means that newlines are kept as they are, both when writing and when reading. A slight issue is that html_separator_regex in brain_brew/representation/yaml/note_model_template.py would stop matching if \r\n is the newline separator used by the user in the relevant HTML template files. (html_separator_regex could be adjusted, though.)
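A rough sketch of both points. The separator patterns here are hypothetical stand-ins, not the actual html_separator_regex from brain-brew. They only show how a pattern written for \n stops matching on \r\n input and how an optional \r could fix that.

```python
import os
import re
import tempfile

raw = b"front\r\n--\r\nback\r\n"  # sample template using \r\n newlines

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "template.html")  # made-up file name
    with open(path, "wb") as f:
        f.write(raw)

    # newline='' disables newline translation on read...
    with open(path, "r", encoding="utf-8", newline="") as f:
        text = f.read()

    # ...and on write, so the round-trip leaves the bytes untouched.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)

    with open(path, "rb") as f:
        preserved = f.read()

# Hypothetical separator patterns (NOT the real html_separator_regex).
strict = re.compile(r"\n--\n")          # assumes \n newlines only
tolerant = re.compile(r"\r?\n--\r?\n")  # also accepts \r\n

strict_match = strict.search(text)
parts = tolerant.split(text)
```

With newline='' the \r characters reach the regex, so any pattern that assumes bare \n needs an optional \r (or equivalent) to keep matching.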

(I think I hate newlines almost as much as encoding issues :))