psychoinformatics-de / datalad-tabby

DataLad extension package for the "tabby" dataset metadata specification

Add an encoding parameter to `io.load_tabby` #116

Open mslw opened 11 months ago


This PR resolves #112 by adding an optional `encoding` parameter to `io.load_tabby`. The parameter specifies the encoding used when reading TSV files.

When the parameter is not specified (`encoding=None`), the default behavior is kept (implicitly using `locale.getencoding()` ^1,^2).
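To illustrate what passing an explicit encoding changes, here is a minimal sketch. The helper `read_tsv_rows` is hypothetical and is not the `io.load_tabby` implementation; it only assumes that the encoding value ends up being forwarded to Python's file-opening machinery, where `None` means "use the platform default".

```python
import csv
import tempfile
from pathlib import Path


def read_tsv_rows(path, encoding=None):
    """Hypothetical helper, not part of datalad-tabby.

    encoding=None defers to Python's locale-dependent default
    for text files; any other value is used as given.
    """
    with Path(path).open(newline="", encoding=encoding) as f:
        return list(csv.reader(f, delimiter="\t"))


def demo():
    """Read the same UTF-8 file with a matching and a mismatching encoding."""
    with tempfile.TemporaryDirectory() as tmp:
        tsv = Path(tmp) / "authors.tsv"
        # Write raw bytes so the on-disk encoding is known to be UTF-8.
        tsv.write_bytes("name\tJürgen Müller\n".encode("utf-8"))
        right = read_tsv_rows(tsv, encoding="utf-8")
        # Latin-1 decodes any byte sequence, so this does not fail;
        # it silently yields garbled text instead.
        wrong = read_tsv_rows(tsv, encoding="latin-1")
    return right, wrong
```

Calling `demo()` returns the correctly decoded rows alongside the silently garbled ones, which is the failure mode an explicit encoding lets the user avoid.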

With external libraries it might be possible to guess, from a file's content, an encoding that produces a correct result, but success is not guaranteed when there are only a few non-ASCII characters in the entire file (think: a list of authors). I made an attempt with #114 but didn't like it in the end. Here, we do not attempt to guess and instead expect the user to know the encoding they need to use.
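A rough sketch of why guessing is fragile: with only a couple of non-ASCII bytes, several encodings decode the same data without error, and not all of them give the same text. The candidate list and the helper name `try_decode` are chosen for illustration only; this is not the approach taken in #114.

```python
# Bytes as they might sit on disk for a file saved as Latin-1.
data = "Jürgen Müller".encode("latin-1")


def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Illustrative candidates; a real detector would consider many more.
candidates = ["utf-8", "latin-1", "cp1252", "cp437"]
results = {enc: try_decode(data, enc) for enc in candidates}

# Encodings that decoded without raising an error.
plausible = {enc: text for enc, text in results.items() if text is not None}

for enc, text in plausible.items():
    print(f"{enc}: {text}")
```

Only UTF-8 is ruled out by a decoding error here. The remaining candidates all succeed, so nothing in the bytes alone tells a guesser which result is the intended one.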

This PR also fixes an unrelated documentation typo to satisfy the codespell checks.