Closed acka47 closed 7 months ago
Not sure how to fix this with Metafacture. The encoding problem seems to be already given by the source data. Perhaps @blackwinter or @dr0i can help here.
Not sure, but it strikes me as odd that this is exactly what the input already shows. So either:
a) the input needs special treatment in the first place. The file command shows:
$ file metafacture-runner/src/main/dist/examples/read/marc21/10.marc21
metafacture-runner/src/main/dist/examples/read/marc21/10.marc21: MARC21 Bibliographic
(but the umlaut is not displayed properly in my UTF-8 terminal), or
b) the input file is not stored correctly (broken characters instead of UTF-8 or extended ASCII, which raises the question of which character set the MARC21 file should use).
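One way to narrow down case b) is to check whether the raw bytes decode as UTF-8 at all. The following is only a minimal sketch: the function name `is_valid_utf8` and the sample byte strings are made up for illustration and are not taken from the example file.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Illustrative samples (assumptions, not real record data):
utf8_sample = "Müller".encode("utf-8")      # umlaut stored as UTF-8
latin1_sample = "Müller".encode("latin-1")  # umlaut stored as a single 0xFC byte
marc8_like_sample = b"M\xe8uller"           # 0xE8 placed before the base letter

for name, sample in [
    ("utf8", utf8_sample),
    ("latin1", latin1_sample),
    ("marc8-like", marc8_like_sample),
]:
    print(name, is_valid_utf8(sample))
```

For a real file, the bytes would come from reading it in binary mode (`open(path, "rb")`). A failed decode only shows that the file is not UTF-8; it does not say which encoding it actually uses.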
I found some examples without encoding problems:
https://raw.githubusercontent.com/metafacture/metafacture-tutorial/main/data/sample.marc21
@acka47 should I use these instead?
The input file seems to be MARC-8 encoded. From the spec:
"In a MARC-8-encoded MARC 21 record, Leader character position 9 (Character coding scheme) must contain a space character (20(hex))."
Conversion from MARC-8 to Unicode can be done with tools like yaz-marcdump or MarcEdit.
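The Leader check from the spec can be done by looking at the byte at position 9 of a record. Below is a minimal sketch, assuming the record starts at the first byte of the data; the function name `leader_encoding` and the two sample leaders are invented for illustration. In MARC 21, a space at that position means MARC-8 and `a` means UCS/Unicode.

```python
def leader_encoding(record: bytes) -> str:
    """Classify a raw MARC 21 record by Leader position 9 (0-based)."""
    if len(record) < 24:
        raise ValueError("a MARC 21 leader is 24 bytes long")
    flag = record[9:10]
    if flag == b" ":
        return "MARC-8"
    if flag == b"a":
        return "UCS/Unicode"
    return "unknown"


# Made-up sample leaders (assumptions, not from the example file):
marc8_leader = b"00714cam  2200205 a 4500"    # space at position 9
unicode_leader = b"00714cam a2200205 a 4500"  # 'a' at position 9

print(leader_encoding(marc8_leader))
print(leader_encoding(unicode_leader))
```

To check an actual file, read its first 24 bytes in binary mode and pass them to the function. Note that the flag only states what the record claims; a record can still carry bytes in a different encoding than its Leader says.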
The example is quite old. At least MF does not say it is not UTF-8. Other MARC-8 examples like the one in PyMarc throw errors when being transformed with MF and need a workaround.
I just went through https://github.com/metafacture/metafacture-documentation/blob/master/MF-in-5-min.md. In the MARC example, there are problems when I run it in the MF playground:
Is there a way to fix this in the flux or is it a Playground problem?