archy-bold closed 9 years ago
Thanks! All diacritics are coming through fine for me...can you be more specific about where you're seeing this (file/line no.) and how you're viewing the data?
I know my facility with Python encoding is abysmal, so the problem (once located) is almost certainly on my end.
On Apr 30, 2015, at 3:23 PM, Mitch notifications@github.com wrote:
Thanks! All diacritics are coming through fine for me...can you be more specific about where you're seeing this (file/line no.) and how you're viewing the data?
I just found an example...line 16610 of complete.xml. I'll check what's in our Solr index, since that's where this data is originating. Possible that I need to re-index some records.
I should have been more specific there; I didn't realise some of them were encoded OK.
There are other examples in the data. Searching for the character Ã is always a surefire way of locating them.
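In case it helps with tracking them down, here is a minimal sketch of that search in Python. The function name `find_suspect_lines` and the sample lines are made up for illustration, and it assumes the file is read as UTF-8. It will also flag any name that legitimately contains Ã, so treat the hits as candidates rather than confirmed errors.

```python
from typing import Iterable, List, Tuple

# The marker character suggested above: 'Ã' (U+00C3).
MARKER = "\u00c3"


def find_suspect_lines(lines: Iterable[str], marker: str = MARKER) -> List[Tuple[int, str]]:
    """Return (line number, text) pairs for every line containing the marker.

    Line numbers start at 1 so they match what an editor shows.
    """
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if marker in line:
            hits.append((lineno, line.rstrip("\n")))
    return hits


# Hypothetical sample input; real usage would pass an open file instead.
sample = [
    "Bach, Johann Sebastian\n",
    "Chopin, Fr\u00c3\u00a9d\u00c3\u00a9ric\n",
]
for lineno, text in find_suspect_lines(sample):
    print(f"{lineno}: {text}")
```

To run it against the real data, something like `with open("complete.xml", encoding="utf-8") as fh: find_suspect_lines(fh)` should work, since a file object yields its lines one at a time.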
The issue was with the migration from our database of record into the Alfresco repo. I made a fix in the migration config and re-migrated the metadata, so this gremlin is now removed.
I've noticed that the encoding seems to have garbled the accented characters.
For example, Frédéric Chopin appears in the data as "Chopin, FrÃ©dÃ©ric".
Great repository, otherwise!
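For what it's worth, this kind of garbling is what you typically get when UTF-8 bytes are decoded as Latin-1 somewhere along the way. That is my assumption about the mechanism, not something I have confirmed for this data. A small Python sketch of the round trip, with `garble` and `repair` as illustrative helper names:

```python
def garble(text: str) -> str:
    """Simulate the suspected fault: UTF-8 bytes misread as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Try to undo the misreading; return the input unchanged if it does not apply."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not produced by this particular mix-up (or is already fine).
        return text


original = "Chopin, Frédéric"
broken = garble(original)   # each 'é' becomes the two characters 'Ã' and '©'
fixed = repair(broken)      # round-trips back to the original string
print(broken)
print(fixed)
```

The `repair` helper leaves correctly encoded names alone, because re-encoding them as Latin-1 produces bytes that are not valid UTF-8. Fixing the encoding at the source is still the better option; patching strings afterwards is only a stopgap.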