CustodesTechnologia / System

The repository devoted to planning, organizing, and executing work items pursuant to the System

Special Characters Not Converting #22

Closed Brother-Tyler closed 2 years ago

Brother-Tyler commented 2 years ago

For some reason, apostrophes [from the existing site] don't render in the update. Instead, they come out as "â€™" (http://beta.bolterandchainsword.com/topic/373283-updating-the-b38c-101/#comment-5797174).

Is this something that I can fix via the AdminCP (e.g., an auto-edit via the swear filter) or do we need to do something else?

sibomots commented 2 years ago

I don't know yet for sure the most effective way to fix that.

It's likely a character-encoding issue in the move from the legacy site to the new one. The character sets probably weren't converted correctly by the automatic database upgrade back in February.
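To make that diagnosis concrete, here is a minimal sketch of how this kind of garbling typically arises. It assumes the stored text was UTF-8 and was read back as Windows-1252 (`cp1252`) somewhere along the way. That pairing is an assumption, not something confirmed in this thread, and `misdecode` is a hypothetical helper name.

```python
# Sketch: UTF-8 bytes interpreted with the wrong single-byte encoding.
# Assumption: the wrong encoding was Windows-1252 (cp1252).

def misdecode(text: str, wrong_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)

curly = "\u2019"    # RIGHT SINGLE QUOTATION MARK, three bytes in UTF-8
straight = "'"      # plain ASCII apostrophe, one byte in UTF-8

print(repr(misdecode(curly)))      # three characters, one per UTF-8 byte
print(repr(misdecode(straight)))   # unchanged, since ASCII is identical in both encodings
```

Under this assumption, only characters outside ASCII are affected, because ASCII bytes mean the same thing in both encodings.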

The brute force way to fix it is to edit the SQL tables in the beta site (upgraded site).

There might be an ACP tool for that, but I'd be suspicious of it.

The short answer is that this is not hard to fix. It's just a bit tricky to find the tables and records that would hold this data. I can add it to my list of database work.
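The brute-force table edit described above could look roughly like the following sketch. It uses an in-memory SQLite database as a stand-in for the real one, and the `posts` table, its `body` column, and the helper names are all hypothetical. It also assumes the damage is UTF-8 text that was mis-decoded as `cp1252`. Rows that do not round-trip cleanly are left untouched.

```python
import sqlite3

def repair_mojibake(text: str, wrong_encoding: str = "cp1252") -> str:
    """Reverse a UTF-8-read-as-cp1252 mix-up; return text unchanged if it does not apply."""
    try:
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text  # already clean, or damaged in some other way

def repair_column(conn: sqlite3.Connection, table: str, column: str, key: str = "id") -> int:
    """Rewrite damaged values in one text column; return the number of rows changed.

    table, column and key are interpolated into the SQL, so they must be trusted names.
    """
    rows = conn.execute(f"SELECT {key}, {column} FROM {table}").fetchall()
    changed = 0
    for row_id, value in rows:
        fixed = repair_mojibake(value) if isinstance(value, str) else value
        if fixed != value:
            conn.execute(f"UPDATE {table} SET {column} = ? WHERE {key} = ?", (fixed, row_id))
            changed += 1
    conn.commit()
    return changed

# Demo on a throwaway database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO posts (body) VALUES (?)",
    [("don\u00e2\u20ac\u2122t",), ("Kh\u00e2rn",), ("plain text",)],  # damaged, clean, ASCII
)
print(repair_column(conn, "posts", "body"))
```

A pass like this is best tried on a copy of the data first, since text that legitimately contains such character sequences would also be rewritten.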

Brother-Tyler commented 2 years ago

For what it's worth, I've added it to the badword (swear) filter on the legacy site, changing that string of characters to an apostrophe. I'm not sure if that will work, but it might save you some work.

Actually, the problem is only when angled apostrophes (not sure of the technical name - apostrophes from fonts/character sets that aren't the standard apostrophes) are used. Regular apostrophes seem to work just fine.

Brother-Tyler commented 2 years ago

(transcribed over from Tech Workshop)

Æ Latin Capital Letter Ae | Unicode number: U+00C6 | HTML-code: Æ (SCÆNICUS EXTUDO - number6's title with all caps)
æ Latin Small Letter Ae | Unicode number: U+00E6 | HTML-code: æ (Præfect Sociorum - WarriorFish's title with normal capitalization)
Ö Latin Capital Letter O with Diaeresis | Unicode number: U+00D6 | HTML-code: Ö (PETEYSÖDES)
ö Latin Small Letter O with Diaeresis | Unicode number: U+00F6 | HTML-code: ö (PeteySödes) becomes ö
á Latin Small Letter a with Acute | Unicode number: U+00E1 | HTML-code: á (Selgairhán - the Phoenix Lord I created for my DIY Void Hornets aspect warriors) becomes á
É Latin Capital Letter E with Acute | Unicode number: U+00C9 | HTML-code: É (ULTHWÉ)
é Latin Small Letter E with Acute | Unicode number: U+00E9 | HTML-code: é (Ulthwé and Forté) becomes é
 Latin Capital Letter a with Circumflex | Unicode number: U+00C2 | HTML-code:  (KHÂRN)
â Latin Small Letter a with Circumflex | Unicode number: U+00E2 | HTML-code: â (Khârn) becomes â
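For reference, a word-filter style replacement table for the characters listed above can be generated rather than typed by hand. This is only a sketch: it assumes each character was stored as UTF-8 and displayed as `cp1252`, and `build_filter` and `apply_filter` are hypothetical helper names, not part of any forum software.

```python
# Characters taken from the list above.
LISTED = "ÆæÖöáÉéÂâ"

def build_filter(chars: str, wrong_encoding: str = "cp1252") -> dict[str, str]:
    """Map each character's garbled form (assumed UTF-8 read as cp1252) to the character."""
    return {c.encode("utf-8").decode(wrong_encoding): c for c in chars}

def apply_filter(text: str, table: dict[str, str]) -> str:
    """Replace every garbled sequence found in text, longest sequences first."""
    for bad in sorted(table, key=len, reverse=True):
        text = text.replace(bad, table[bad])
    return text

table = build_filter(LISTED)
for bad, good in table.items():
    print(f"U+{ord(good):04X}  {good}  <-  {bad}")

print(apply_filter("KhÃ¢rn of UlthwÃ©", table))
```

Unlike re-decoding whole strings, a lookup table only fixes the sequences it knows about, so any character not in the list would still show up garbled.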

Brother-Tyler commented 2 years ago

From what I can see, most of these have been solved by including them in the site's word filter. I haven't found any that are still showing up as the alternate characters.

sibomots commented 2 years ago

I think it's fixed too. It likely had to do with the UTF-8 character-encoding conversion step that was performed when the site upgrade was re-run on the real hardware. That step completed before the upgrade proper began, so I was glad to see that the encoding problem was resolved.

I'm closing this for now. If we see any more glitches related to character-set encoding, we can open a new ticket.