uogbuji / pybibframe

OBSOLETE. Now maintained at https://github.com/zepheira/pybibframe ... Some open-source tools for working with BIBFRAME (see http://bibframe.org)
Apache License 2.0

Indications of i18n problems converting MARC21/binary to MARC/XML (re: pymarc) #1

Open uogbuji opened 10 years ago

uogbuji commented 10 years ago

We use pymarc to parse MARC21 (e.g. in marcbin2xml), but it has serious i18n issues. We have run into them in practice, and they are also indicated in this thread, which is very confusing and contains several worrying comments about how Unicode support in Python is understood.

We might have to find an alternative tool. yaz-marcdump was mentioned.

uogbuji commented 10 years ago

OK, added info on yaz-marcdump to the README. In the wild we've encountered MARC21 in different encodings, so some guesswork about the input encoding might be needed, e.g.:

yaz-marcdump -i marc -o marcxml -f iso-8859-1 -t UTF-8 scratch/records.mrc > scratch/records.mrx

Here one can experiment with the -f (source encoding) parameter until the character output looks right.
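
The trial-and-error over -f could be scripted so each guess lands in its own output file for side-by-side comparison. The following is only a rough sketch, not something from the README: the function names, the candidate encoding list, and the output file naming are all hypothetical, and it assumes yaz-marcdump is on the PATH. It uses only the flags from the command above.

```python
import shlex
import subprocess

# Hypothetical list of source encodings to try; adjust for your own data.
CANDIDATE_ENCODINGS = ["marc-8", "utf-8", "iso-8859-1"]


def marcdump_command(source, from_encoding, to_encoding="UTF-8"):
    """Return the yaz-marcdump argument list for one candidate source encoding."""
    return [
        "yaz-marcdump",
        "-i", "marc",
        "-o", "marcxml",
        "-f", from_encoding,
        "-t", to_encoding,
        source,
    ]


def convert_with_each_encoding(source, dest_stem):
    """Write one MARC/XML file per candidate encoding, for manual comparison."""
    for encoding in CANDIDATE_ENCODINGS:
        dest = f"{dest_stem}.{encoding}.mrx"
        cmd = marcdump_command(source, encoding)
        # Echo the equivalent shell command so the run is easy to reproduce.
        print(shlex.join(cmd), ">", dest)
        with open(dest, "wb") as out:
            # check=False: a wrong guess may make yaz-marcdump exit non-zero,
            # and we still want to try the remaining candidates.
            subprocess.run(cmd, stdout=out, check=False)
```

Calling convert_with_each_encoding("scratch/records.mrc", "scratch/records") would then leave one .mrx file per guess; judging which one has good character output is still a manual step.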