eprintsug / EPrintsArchivematica

Digital Preservation through EPrints-Archivematica Integration - An EPrints export plugin to Archivematica

wide character warnings during process_transfers #21

Open photomedia opened 4 years ago

photomedia commented 4 years ago

I get the following warnings, periodically in a portion of the records when running process_transfers:

Wide character in print at /opt/eprints3/perl_lib/EPrints/XML.pm line 717.
Wide character in print at /opt/eprints3/lib/plugins/EPrints/Plugin/Export/Archivematica/EPrint.pm line 167.

I don't know the source of the issue. Could it be some miscoded (non-UTF-8) Unicode characters in the eprint metadata?

geo-mac commented 4 years ago

I don't know the source of the issue. Could it be some miscoded (non-UTF-8) Unicode characters in the eprint metadata?

I was obsessed with these warnings when we first began testing. But, as far as I could tell, it is exactly as you suggest: non-UTF-8 characters. I fixed some of the affected records to observe the effect, and the warning was no longer raised for them; however, we will not be able to do this for every record. We will therefore have to grin and bear it! Having said that, it is difficult to anticipate what unintended consequences this may have. Archivematica does not appear to have experienced any issues thus far.

mpbraendle commented 4 years ago

Have a look at https://perlgeek.de/en/article/encodings-and-unicode ;-)
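
For readers who want to see the mechanism that article describes, here is a minimal, self-contained sketch. It is not taken from EPrints or the plugin. Perl raises "Wide character in print" when a decoded string containing a character above 0xFF is printed to a filehandle that has no encoding layer. Strings whose characters all fit in one byte are written without a warning, which may be why only a portion of the records trigger it. The helper print_and_capture and the sample text are hypothetical, and whether XML.pm line 717 and EPrint.pm line 167 print to such a handle is an assumption.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not part of EPrints): print $text to an in-memory
# filehandle opened with $mode, and return the bytes written together with
# a reference to the list of warnings raised while printing.
sub print_and_capture {
    my ($mode, $text) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    my $buffer = '';
    open(my $fh, $mode, \$buffer) or die "cannot open in-memory handle: $!";
    print {$fh} $text;
    close($fh);
    return ($buffer, \@warnings);
}

# A decoded character string containing U+2014, a code point above 0xFF.
my $text = "em dash: \x{2014}";

# 1. No encoding layer on the handle: Perl warns at print time.
my ($raw_bytes, $raw_warn) = print_and_capture('>', $text);

# 2. Encoding layer declared on the handle: no warning.
my ($layer_bytes, $layer_warn) = print_and_capture('>:encoding(UTF-8)', $text);

# 3. String encoded to bytes before printing to a plain handle: no warning.
my ($pre_bytes, $pre_warn) = print_and_capture('>', encode('UTF-8', $text));

printf "no layer:        %d warning(s)\n", scalar @$raw_warn;
printf "encoding layer:  %d warning(s)\n", scalar @$layer_warn;
printf "encoded first:   %d warning(s)\n", scalar @$pre_warn;
```

In this sketch, either declaring the layer on the output handle or encoding the string before printing avoids the warning. Which of the two would suit the plugin depends on how its output handles are opened, which this thread does not show.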