mutewinter / Showbot

🤖 An IRC Bot and Website for 5by5.tv written with the Cinch and Sinatra frameworks
MIT License

Replace HTML entities in tweets pushed to IRC? #29

Open ruok5 opened 12 years ago

ruok5 commented 12 years ago

Example:

<showbot> @5by5: The Critical Path with @asymco &amp; @danbenjamin is starting now - http://t.co/sI19cI8S (17 seconds ago)

For what it's worth, I had the same issue with something very similar I wrote in Perl, where I was passing tweets through decode_entities() from HTML::Entities. The problem seemed to be solely that &amp; was not being decoded properly, and I ended up fixing it with a simple replace before passing the tweet text on to decode_entities(). Kludgy, but it worked.

mutewinter commented 12 years ago

There are also occasional issues with the website not showing entities properly on Link suggestions. I have been meaning to use http://htmlentities.rubyforge.org/ to fix this up.

Thanks for pointing this out.
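
A minimal Ruby sketch of the decoding step discussed here, offered only as an illustration. It assumes the standard library's CGI.unescapeHTML is an acceptable stand-in for the htmlentities library mentioned above; CGI.unescapeHTML covers only a few basic named entities (such as &amp; and &lt;) plus numeric ones. The helper name decode_tweet is hypothetical and is not taken from Showbot.

```ruby
require "cgi"

# Hypothetical helper: decode HTML entities in tweet text
# before the text is pushed to IRC.
def decode_tweet(text)
  CGI.unescapeHTML(text)
end

tweet = "The Critical Path with @asymco &amp; @danbenjamin is starting now"
puts decode_tweet(tweet)
# prints: The Critical Path with @asymco & @danbenjamin is starting now
```

A fuller entity table (for example &nbsp; or &eacute;) would need a dedicated library rather than this stdlib call.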

ruok5 commented 11 years ago

FWIW, I was just looking at my own Twitter bot, trying to diagnose why my entity replacement didn't seem to be working, and noticed that the Twitter API returns double-escaped entities.

For example, a literal < is returned as &amp;lt;.

I solved the issue by running the text through the filter twice.

use HTML::Entities;    # provides decode_entities()

$tweet_text = decode_entities($tweet_text);    # &amp;lt; --> &lt;
$tweet_text = decode_entities($tweet_text);    # &lt;     --> <

Crude, but effective.
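
The same two-pass idea could be sketched in Ruby, the language Showbot is written in. This is an assumption-laden illustration, not the project's code: decode_entities_n and its passes parameter are hypothetical names, and CGI.unescapeHTML from the standard library stands in for Perl's decode_entities().

```ruby
require "cgi"

# Hypothetical helper: run the entity decoder a fixed number of times.
# Two passes undo the double escaping described above.
def decode_entities_n(text, passes: 2)
  passes.times { text = CGI.unescapeHTML(text) }
  text
end

puts decode_entities_n("1 &amp;lt; 2")             # prints: 1 < 2
puts decode_entities_n("1 &amp;lt; 2", passes: 1)  # prints: 1 &lt; 2
```

A fixed pass count is used rather than decoding until the text stops changing, since looping to a fixed point would also decode entity-like text that a tweet author typed literally.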