LibreCat / Catmandu-MARC

Catmandu modules for working with MARC data
https://metacpan.org/release/Catmandu-MARC

Character encoding output question #98

Closed: jasloe closed this issue 5 years ago

jasloe commented 5 years ago

I am working with a dataset that contains some poorly encoded strings, for example:

=001  TR1311
=245  10$aÀ la Albéniz$h[electronic resource]

I am passing these records through a lookup with values containing the correct character set and encoding:

key,value
TR1311,À la Albéniz

The Fix script is:

marc_map(001,identifier)
lookup(identifier,'lookup.csv')
marc_replace_all('245',a,$.identifier)

This runs without errors; however, the output is not what I expected:

=001  TR1311
=245  10$a{copy} l{deg} la Alb{caron}niz Alb{copy}{flat}niz$h[electronic resource]

I'm not entirely clear what's going on here. All of the resources I am working with are UTF-8 encoded. Moreover, as I understand it, Catmandu uses UTF-8 by default. I've tried converting from MARCMaker to ISO and vice versa without any luck. Any ideas? I'm rather stumped.
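
One way to reason about the output above is to look at the raw bytes. The Python sketch below is a hypothesis, not a confirmed diagnosis: it assumes the original 245 field holds UTF-8 bytes, the lookup value arrives as Latin-1 bytes, the regex `a` in `marc_replace_all` matches the single lowercase "a" in the title rather than selecting subfield `a`, and the writer renders each high byte through a MARC-8 style mnemonic table. The `MNEMONICS` table and the `as_marcmaker` helper are hypothetical, inferred only from the mnemonics visible in the output.

```python
import re

# Assumed subset of a MARC-8 byte-to-mnemonic table (hypothetical,
# inferred from the mnemonics that appear in the reported output).
MNEMONICS = {0xC3: "{copy}", 0xA9: "{flat}", 0xC0: "{deg}", 0xE9: "{caron}"}


def as_marcmaker(raw: bytes) -> str:
    """Render bytes the way a MARC-8 oriented writer might (sketch only)."""
    out = []
    for b in raw:
        if b in MNEMONICS:
            out.append(MNEMONICS[b])
        elif 0x20 <= b < 0x7F:
            out.append(chr(b))  # printable ASCII passes through unchanged
        # any other byte (e.g. 0x80) is silently dropped in this sketch
    return "".join(out)


title = "À la Albéniz"
original = title.encode("utf-8")       # assumed bytes of the existing 245 field
replacement = title.encode("latin-1")  # assumed bytes of the lookup value

# Replace every match of the regex "a" with the lookup value,
# mirroring the assumed effect of marc_replace_all('245',a,...).
merged = re.sub(b"a", lambda m: replacement, original)

rendered = as_marcmaker(merged)
print(rendered)
# {copy} l{deg} la Alb{caron}niz Alb{copy}{flat}niz
```

Under these assumptions the sketch reproduces the reported string exactly, which suggests two separate things to check: whether the field and the lookup value really share one encoding at write time, and whether the regex argument replaces the whole subfield value or only the letter it matches.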

jasloe commented 5 years ago

Whoops, I meant to post this in the Catmandu issue queue. Apologies for the noise.