Open johnpeart opened 10 months ago
This is really strange, thanks for the details. I'm trying to track down when this started happening. It seems to me that anything before August has emoji, but after August that's not the case. I did do a bunch of maintenance work in August on this, in particular to make it accept Emoji in a URL itself. That's bizarre that it would have broken storing emoji tho! I will investigate more.
Thanks Aaron.
I did try skimming through the previous commits to see if I could identify anything obviously changed, but I think it's beyond my amateurish coding skills!
The only related commit I can figure out is this one, which sets a property on the database connection:
https://github.com/aaronpk/webmention.io/commit/1df6373363071c2d93e85d4d20b47fbe41424ed6
I reverted that commit and that seems to be storing emoji properly now. Thanks for catching that. I'm assuming there is now something else broken with emoji in URLs that this was supposed to fix, but that happens much less often than emoji in post contents so we'll deal with that separately later.
This appears to still be happening even for mentions created since the revert of that commit. Example from one on my site here:
{
  "type": "entry",
  "author": {
    "type": "card",
    "name": "Robb Knight",
    "photo": "https://webmention.io/avatar/media.social.lol/d2b5943b2e687ef31399f8241bd07d88b1140716a89c5e026038eaf8ec5341b7.jpg",
    "url": "https://social.lol/@robb"
  },
  "url": "https://social.lol/@robb/111416365374993541",
  "published": "2023-11-15T20:06:57+00:00",
  "wm-received": "2023-11-15T20:27:01Z",
  "wm-id": 1738813,
  "wm-source": "https://brid.gy/comment/mastodon/@robb@social.lol/111415917630210220/111416365374993541",
  "wm-target": "https://rknight.me/using-the-johnny-decimal-system/",
  "wm-protocol": "webmention",
  "content": {
    "html": "<p><span class=\"h-card\"><a href=\"https://hachyderm.io/@johnnydecimal\" class=\"u-url\">@<span>johnnydecimal</span></a></span> No better way to find a typo than by publishing the post ????</p><p>Will be fixed in the next few minutes.</p>",
    "text": "@johnnydecimal No better way to find a typo than by publishing the post ????Will be fixed in the next few minutes."
  },
  "in-reply-to": "https://rknight.me/using-the-johnny-decimal-system/",
  "wm-property": "in-reply-to",
  "wm-private": false
}
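In case it helps anyone collect examples for debugging, here is a rough Python sketch that flags which fields of a mention contain a run of three or more question marks. The function name `suspect_fields` and the choice of fields to check are my own assumptions, not anything from webmention.io, and the check is only a heuristic: a post that genuinely contains "???" will be flagged too.

```python
import re

# Three or more consecutive question marks: a heuristic for a replaced emoji.
SUSPECT = re.compile(r"\?{3,}")


def suspect_fields(mention):
    """Return the names of fields in a mention dict that contain '???' runs."""
    content = mention.get("content") or {}
    author = mention.get("author") or {}
    candidates = {
        "content.text": content.get("text", ""),
        "content.html": content.get("html", ""),
        "author.name": author.get("name", ""),
    }
    return [
        name
        for name, value in candidates.items()
        if isinstance(value, str) and SUSPECT.search(value)
    ]


# Trimmed-down version of the mention above (hypothetical test input).
example = {
    "author": {"name": "Robb Knight"},
    "content": {
        "html": "<p>publishing the post ????</p>",
        "text": "publishing the post ????Will be fixed in the next few minutes.",
    },
}
print(suspect_fields(example))
```

Running this over the entries in a saved API response should give a quick list of affected mentions and fields to attach to a report.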
I can confirm this issue still happens, at least on my site: https://ahmad.build/shortupdate-25-12-2023/#webmentions
Happening for me here too: https://www.joelotter.com/notes/2024/02/05-pacific-drive/
Interestingly it's happening for author names as well as the post content. @aaronpk could this be reopened?
This is happening for me too. Is there anything I can provide from the places where I'm seeing it that'd make it easier to debug?
Same here ... Emoji in content.text and content.html, coming from brid.gy, are replaced by question marks.
Issue
Emoji in Webmentions sent to the www.webmention.io service are sometimes replaced with ??? and ???? when www.webmention.io sends the Webmention to its destination site.
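One pattern that may be worth noting: the replacements come in runs of three or four question marks, which matches the number of bytes a single emoji takes in UTF-8. The Python sketch below only simulates that kind of byte-for-byte loss. It is a guess at the failure mode, not something confirmed anywhere in the service's code, and the function name and the two sample characters (U+2764 and U+1F622) are chosen purely for illustration.

```python
def simulate_bytewise_loss(text):
    """Replace every non-ASCII character with one '?' per UTF-8 byte.

    This imitates a storage layer that cannot represent multi-byte
    characters and substitutes each byte separately (an assumption).
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # plain ASCII survives unchanged
        else:
            out.append("?" * len(ch.encode("utf-8")))
    return "".join(out)


# U+2764 encodes to 3 bytes in UTF-8, U+1F622 to 4 bytes.
print(simulate_bytewise_loss("nice \u2764"))
print(simulate_bytewise_loss("oh no \U0001F622"))
```

If the real cause is along these lines, three question marks would correspond to a character from the Basic Multilingual Plane and four to one outside it, which would point at character set handling somewhere between receiving and storing the mention.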
Expected behaviour
The Webmention should include the original emoji.
Further info
This may be a new bug; emoji were being sent via the service successfully in the past. When I integrated with the www.webmention.io service several weeks ago, emoji were not being replaced. This blog post from my personal site shows a Webmention that successfully includes the emoji. This Webmention originates from a Mastodon post, piped in via Brid.gy and works as expected.
This more recent blog post features a range of Webmentions from people's personal blogs and from Mastodon posts piped in via Brid.gy, all of which have their emoji stripped and replaced with "???" or "????".
Looking at the raw data from the www.webmention.io API shows that the emoji are not present in the most recent API data (but are on the older posts). Here's a link to the webmentions page for my site, in case it's needed to demonstrate.