Astrotomic / php-twemoji

Easily generate Twemoji URLs
MIT License
31 stars, 13 forks

Add HTML parser/replacer #2

Open Gummibeer opened 3 years ago

Gummibeer commented 3 years ago

The original JavaScript Twemoji client can replace all emojis in a given text with the corresponding Twemoji image tags in one go. We should provide a similar method, such as Twemoji::parse($html), that searches for emojis and replaces them.

This requires a DOM/HTML parser so that emojis in places like image alt attributes aren't replaced, so it should be an opt-in feature. The DOM library therefore shouldn't be part of the default dependencies, only a suggested one.
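A rough sketch of the "text nodes only" idea, assuming PHP's bundled ext-dom and ext-mbstring rather than any particular third-party DOM library. The function name `replaceEmojisInHtml`, the `$urlFor` callback, the example.test URL and the simplified emoji pattern are all hypothetical and not part of this package's API; the pattern only matches single code points, so ZWJ sequences, flags and skin tones are not handled.

```php
<?php

declare(strict_types=1);

// Simplified, illustrative pattern: single code points only.
const EMOJI_PATTERN = '/([\x{1F300}-\x{1FAFF}\x{2600}-\x{27BF}])/u';

// Hypothetical helper: replaces emojis in text nodes, never in attributes.
function replaceEmojisInHtml(string $html, callable $urlFor): string
{
    $previous = libxml_use_internal_errors(true); // silence warnings about HTML5 tags
    $dom = new DOMDocument();
    // The meta tag makes libxml read the fragment as UTF-8.
    $dom->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><body>'
        . $html . '</body>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Collect text nodes first, because the loop below mutates the tree.
    $xpath = new DOMXPath($dom);
    $textNodes = iterator_to_array(
        $xpath->query('//body//text()[not(ancestor::script) and not(ancestor::style)]')
    );

    foreach ($textNodes as $textNode) {
        $parts = preg_split(EMOJI_PATTERN, $textNode->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE) ?: [];
        if (count($parts) < 2) {
            continue; // no emoji in this text node
        }

        $fragment = $dom->createDocumentFragment();
        foreach ($parts as $index => $part) {
            if ($index % 2 === 1) {
                // Odd indexes hold the captured emoji.
                $img = $dom->createElement('img');
                $img->setAttribute('src', $urlFor($part));
                $img->setAttribute('alt', $part);
                $fragment->appendChild($img);
            } elseif ($part !== '') {
                $fragment->appendChild($dom->createTextNode($part));
            }
        }
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $output = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $output .= $dom->saveHTML($child);
    }

    return $output;
}

// Hypothetical URL builder; a real one would come from the package itself.
$urlFor = static fn (string $emoji): string =>
    'https://example.test/twemoji/' . dechex(mb_ord($emoji, 'UTF-8')) . '.svg';

echo replaceEmojisInHtml('<p title="🚀">Launch 🚀</p>', $urlFor), PHP_EOL;
```

In this example only the emoji in the paragraph text becomes an image; the one in the title attribute is never visited, because the XPath query selects text nodes only.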

The method should also be flagged as experimental so everyone knows that it may cause trouble.

To test this, we should use snapshot testing, for example on some blog posts (they could be faked). The snapshots should cover emojis in plain text (.txt), in the displayed content of HTML (outside of tags, inside one or more tags, with or without a body), and as part of HTML attributes.

content parser and emoji replacer for:

mallardduck commented 2 years ago

Had Twemoji concepts on my mind recently because I noticed how poorly Slack handles its implementation. Specifically, compared to Twitter's on-site implementation (meaning the results on twitter.com), Slack's is really bad. For instance, on Twitter you can freely copy tweet text and be sure the emojis are preserved.

Granted, what you get when you paste them varies with the program/app. However, any app that accepts "plain text" and supports Unicode/emoji will gladly take the paste and keep the emoji in place (again, with small edge cases around device compatibility and such).

With all of that in mind, I was thinking it'd be cool to make sure this package is able to "do the right thing". The fix for this is kinda simple TBH: when rendering the Twemoji image tag, set the alt text to the Unicode character(s) of the emoji.
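To make the suggestion concrete, here is a minimal sketch of an image tag that carries the original emoji as its alt text. The function name `renderEmojiImage` and the example.test URL are made up for illustration and are not this package's API; the URL is passed in on purpose, since generating it is what the package is for.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: builds an <img> tag whose alt text is the emoji itself,
// so copying the rendered text (or reading it with a screen reader) keeps the emoji.
function renderEmojiImage(string $emoji, string $url): string
{
    return sprintf(
        '<img src="%s" alt="%s">',
        htmlspecialchars($url, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($emoji, ENT_QUOTES, 'UTF-8')
    );
}

echo renderEmojiImage('🚀', 'https://example.test/twemoji/1f680.svg'), PHP_EOL;
// prints: <img src="https://example.test/twemoji/1f680.svg" alt="🚀">
```

Escaping both values with htmlspecialchars keeps the tag well-formed even if the URL or the alt text ever contains quotes or ampersands.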

I think the feature I'm talking about here and the idea of parsing a block of content have some important overlap, at the very least in the sense that for the blog-post use case I'd want the generated HTML to include the proper alt text for accessibility. Long story short, LMK if you'd be open to me taking a pass at solving this issue and working in this accessibility feature too.

Gummibeer commented 2 years ago

Could be that I'm dumb, but as far as I can see and know my code, it already does exactly what you want!? https://github.com/Astrotomic/php-twemoji/blob/960bc12c1e156a21a869efdb9045ed42f54a2c6c/src/EmojiText.php#L42-L65 https://github.com/Astrotomic/php-twemoji/blob/960bc12c1e156a21a869efdb9045ed42f54a2c6c/tests/__snapshots__/ReplacerTest__it_can_replace_emojis_in_plain_text_to_html__1.txt

So the original Emoji 🚀 is the alt of the image!? 🤔

Gummibeer commented 2 years ago

Regarding your offer: for sure you can start with the open part of that issue, parsing HTML.