LibreCat / Catmandu-MARC

Catmandu modules for working with MARC data
https://metacpan.org/release/Catmandu-MARC

add info about how to process files with faulty Unicode characters #99

Closed. jorol closed this 5 years ago

jorol commented 5 years ago

If you process UTF-8 encoded files which contain faulty characters, you will get a fatal error message like:

utf8 "\xD8" does not map to Unicode at ...

Use the iconv tool (libc6-dev Linux package) to preprocess the data and discard the faulty characters. The -c option makes iconv omit characters that cannot be converted instead of aborting:

$ iconv -c -f UTF-8 -t UTF-8 marc21.utf8.raw | catmandu convert MARC to JSON
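To check whether a file needs this preprocessing at all, the same tool can be run without -c, in which case it exits with a non-zero status on the first invalid sequence. The following is only a minimal sketch: the helper names `is_valid_utf8` and `clean_utf8` and the sample data are hypothetical and not part of Catmandu, and some iconv builds still return a non-zero status with -c after dropping characters, so the sketch ignores that status.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the file is well-formed UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Hypothetical helper: write the file to stdout with invalid sequences dropped.
# Some iconv builds exit non-zero after dropping characters, so the exit
# status is deliberately ignored here.
clean_utf8() {
    iconv -c -f UTF-8 -t UTF-8 "$1" || true
}

# Made-up sample data: \330 is octal for the stray byte 0xD8 from the
# error message above; followed by a space it is not valid UTF-8.
sample=$(mktemp)
printf 'Title \330 here\n' > "$sample"

if ! is_valid_utf8 "$sample"; then
    echo "faulty characters found, cleaning" >&2
fi

cleaned=$(clean_utf8 "$sample")
printf '%s\n' "$cleaned"

rm -f "$sample"
```

With real data, the output of such a cleaning step would be piped into `catmandu convert MARC to JSON` as in the command above. Note that dropping bytes silently loses information, so it may be worth logging which files failed the validity check.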

coveralls commented 5 years ago

Coverage decreased (-0.09%) to 92.69% when pulling 441de7163b9948e1a5cacb1706c2bad534a7ddeb on jorol:dev into 6c21631e7c3082a6dd168c06622d0158d3f297cf on LibreCat:dev.
