cldf / csvw

CSV on the web
Apache License 2.0

understand other encoding variants #54

Closed xrotwang closed 3 years ago

xrotwang commented 3 years ago

The R equivalent of utf-8-sig seems to be UTF-8-BOM. A CSV dialect spec reader should understand such common variants as well.

Other relevant names seem to be Latin-1 and latin1.

Maybe call codecs.lookup on the encoding name first, so that an unknown name produces a useful error message up front, rather than a failure while trying to read the data.
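
A minimal sketch of that idea, not the library's actual implementation: the alias table and the normalize_encoding helper are hypothetical names made up for illustration. It assumes that Python's codec registry already resolves Latin-1 and latin1 to the same codec, while UTF-8-BOM is not a name Python knows and so needs an explicit mapping to utf-8-sig.

```python
import codecs

# Hypothetical alias table: encoding names used by other tools (e.g. R),
# mapped to the codec name Python understands.
ALIASES = {"utf-8-bom": "utf-8-sig"}


def normalize_encoding(name):
    """Return Python's canonical codec name for `name`, or raise ValueError."""
    cleaned = name.strip()
    # Compare case-insensitively and treat "_" and "-" alike for alias lookup.
    key = cleaned.lower().replace("_", "-")
    candidate = ALIASES.get(key, cleaned)
    try:
        # codecs.lookup raises LookupError for names the registry cannot resolve.
        return codecs.lookup(candidate).name
    except LookupError:
        raise ValueError(
            f"Unknown encoding {name!r} in CSV dialect description"
        ) from None
```

Called before any file is opened, this turns a bad encoding name into an immediate, descriptive error. For example, normalize_encoding("UTF-8-BOM") resolves to the utf-8-sig codec, and "Latin-1" and "latin1" resolve to the same codec.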