npdoty / planworld

Automatically exported from code.google.com/p/planworld
GNU General Public License v2.0

text encoding problem only on send messages #11

Closed npdoty closed 6 years ago

npdoty commented 6 years ago

In send messages, smart quotes, em dashes, and other characters outside basic Latin-1 are getting replaced by �.

Doesn't affect plans, though, only send.

npdoty commented 6 years ago

I think this is because send messages go through a separate PHP htmlentities step before being saved, which plans don't. htmlentities needs to know the input encoding, and old versions of PHP (like ours; the default only changed to UTF-8 in PHP 5.4) assume iso-8859-1, which doesn't work for users who enter characters outside that charset. (Presumably users writing in Asian languages, for example, would be hitting this problem all the time, but it also applies to the smart quotes that iOS inserts, or to em-dashes.)
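To make the mismatch concrete, here is a minimal sketch, assuming the message reaches PHP as UTF-8 bytes (which is what I'd expect a browser to submit, but I haven't checked our forms). The variable names are just for illustration and aren't from Send.php. It passes the charset explicitly both ways so it behaves the same on any PHP version.

```php
<?php
// A left smart quote (U+201C) followed by plain text, written out as the
// three UTF-8 bytes the client would send. This input is an assumption.
$message = "\xE2\x80\x9Chello";

// Told the input is ISO-8859-1, htmlentities treats each byte as its own
// character, so the three-byte sequence is split apart instead of being
// read as one smart quote.
$asLatin1 = htmlentities($message, ENT_COMPAT, 'ISO-8859-1');

// Told the input is UTF-8, htmlentities sees a single character and can
// replace it with a named entity.
$asUtf8 = htmlentities($message, ENT_COMPAT, 'UTF-8');

// preg_match with the /u flag returns false when the subject is not valid
// UTF-8, which makes it a cheap validity check.
$latin1IsValidUtf8 = (preg_match('//u', $asLatin1) === 1);
$utf8IsValidUtf8 = (preg_match('//u', $asUtf8) === 1);

echo "as ISO-8859-1: valid UTF-8? ", var_export($latin1IsValidUtf8, true), "\n";
echo "as UTF-8:      valid UTF-8? ", var_export($utf8IsValidUtf8, true), "\n";
echo "as UTF-8 output: ", $asUtf8, "\n";
```

The ISO-8859-1 result still contains stray bytes from the original sequence, and a page served as UTF-8 has nothing valid to draw for them, which would explain the � characters.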

Here's the relevant line in the Send code: https://github.com/npdoty/planworld/blob/master/lib/Send.php#L53

And the PHP documentation: https://secure.php.net/manual/en/function.htmlentities.php

So we could either remove the htmlentities step altogether, or tell it to assume UTF-8 when encoding. Assuming UTF-8 won't work for every input, but it should work for a larger share of messages than assuming iso-8859-1. I'm not sure htmlentities is necessary, since we aren't doing that for plans, but it does seem like it would catch some annoying things.

npdoty commented 6 years ago

I've deployed a fix to Neon and Krypton that specifies htmlentities(strip_tags(addslashes($message)), ENT_COMPAT, "UTF-8"), and I think this issue is now resolved.
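For anyone checking the behavior locally, the call above can be wrapped in a small function and tried on a sample message. This is only a sketch: encodeSendMessage is a hypothetical name, not a function in Send.php, and the sample input assumes the message arrives as UTF-8.

```php
<?php
// Hypothetical wrapper around the deployed expression; the real code in
// Send.php makes this call inline rather than through a helper.
function encodeSendMessage($message)
{
    // addslashes escapes quotes and backslashes, strip_tags removes any
    // HTML tags, and htmlentities converts the remaining special
    // characters, now reading the input as UTF-8.
    return htmlentities(strip_tags(addslashes($message)), ENT_COMPAT, "UTF-8");
}

// Sample input: a bold tag and an em dash (U+2014) given as UTF-8 bytes.
$sample = "a <b>b</b> \xE2\x80\x94 c";
$encoded = encodeSendMessage($sample);

echo $encoded, "\n";
```

With the charset given explicitly, the em dash comes out as a named entity and the tags are stripped, so the stored message is plain ASCII regardless of the PHP version's default.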