Closed: jasloe closed this issue 5 years ago
@jasloe Indeed, Catmandu uses UTF-8 by default. This page has some hints on how to preprocess files that have the wrong encoding: https://metacpan.org/pod/release/HOCHSTEN/Catmandu-MARC-1.251/lib/Catmandu/MARC/Tutorial.pod
Your fixes also use the marc_replace_all command incorrectly. This Fix needs three arguments: a MARC path, a regular expression, and a replacement value.
In your case, the marc_replace_all call should have been written like this:
marc_replace_all('245a','^.*$',$.identifier)
Or, with the marc_set Fix, this could be written more simply as:
marc_set(245a,$.identifier)
Can this be closed?
I am working with a dataset that contains some poorly encoded strings, e.g.:
I am passing these records through a lookup with values containing the correct character set and encoding, e.g.:
This is working fine, however the output is not what I was expecting:
I'm not entirely clear what's going on here. All of the resources I am working with are in UTF-8. Moreover, I understand Catmandu uses UTF-8 by default. I've tried converting from MarcMaker to ISO and vice versa without any luck. Ideas? Rather stumped...
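For readers hitting similar symptoms: one common cause of "poorly encoded" strings is UTF-8 bytes that were decoded as Latin-1 somewhere upstream. The sketch below is plain Python, not Catmandu, and only illustrates that one failure mode; the sample value "Müller" and the helper name `repair_mojibake` are made up for illustration, and the thread does not confirm that this is what happened to the records here.

```python
def repair_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes read as Latin-1', if that is what happened.

    Returns the input unchanged when the round trip is not possible,
    which is the case for text that was decoded correctly to begin with.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Hypothetical sample value: simulate the damage, then reverse it.
original = "Müller"
garbled = original.encode("utf-8").decode("latin-1")  # each UTF-8 byte becomes one character

print(garbled)
print(repair_mojibake(garbled))
```

A repair like this is best applied before the records reach the Fix pipeline, since it only makes sense for data known to have been damaged in this specific way. Records in other encodings (for example MARC-8) need a proper conversion step instead.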