progval / Limnoria

A robust, full-featured, and user/programmer-friendly Python IRC bot, with many existing plugins.
https://docs.limnoria.net/

Web: follow html/javascript redirects #1120

Open Mikaela opened 9 years ago

Mikaela commented 9 years ago

It would be more useful if the title snarfer could follow HTML/JavaScript redirects instead of reporting only the title of the redirecting page. The bot should be able to look into those tags the same way it looks into <title>.

This is taken from a jekyll-redirect page, https://mikaela.info/r/cnchigit, and the whole page as returned by curl seems to be:

<!DOCTYPE html>
<meta charset=utf-8>
<title>Redirecting...</title>
<link rel=canonical href="https://github.com/Antergos/Cnchi/issues/303#issuecomment-94838586">
<meta http-equiv=refresh content="0; url=https://github.com/Antergos/Cnchi/issues/303#issuecomment-94838586">
<h1>Redirecting...</h1>
<a href="https://github.com/Antergos/Cnchi/issues/303#issuecomment-94838586">Click here if you are not redirected.</a>
<script>location='https://github.com/Antergos/Cnchi/issues/303#issuecomment-94838586'</script>

jlu5 commented 9 years ago

Meta refresh tags aren't super hard to parse, but JavaScript redirects are a lot more complicated. Since JS is a full programming language, the redirect target can't always be extracted by treating the code as plain text, and running a JavaScript interpreter for a title snarfer is simply overkill IMO (and might be insecure too).
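
For illustration, here is a rough sketch of what the meta refresh half could look like, using only the Python standard library. This is not Limnoria's actual Web plugin code: `MetaRefreshParser` and `find_meta_refresh` are hypothetical names made up for this example, and the regex only covers the common `delay; url=target` form of the `content` attribute.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Matches the common "<delay>; url=<target>" form of the content attribute.
# Assumption: quotes around the target are optional, "url" is case-insensitive.
_REFRESH_RE = re.compile(
    r"""^\s*\d+(?:\.\d*)?\s*[;,]\s*url\s*=\s*["']?(.*?)["']?\s*$""",
    re.IGNORECASE,
)


class MetaRefreshParser(HTMLParser):
    """Hypothetical helper: remembers the target of the first meta refresh tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != 'meta' or self.target is not None:
            return
        attrs = dict(attrs)
        # Attribute values may be None (bare attributes), hence the "or ''".
        if (attrs.get('http-equiv') or '').lower() != 'refresh':
            return
        match = _REFRESH_RE.match(attrs.get('content') or '')
        if match and match.group(1):
            self.target = match.group(1)


def find_meta_refresh(html, base_url):
    """Return the absolute redirect URL found in html, or None if there is none."""
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.target is None:
        return None
    # Targets may be relative, so resolve them against the page's own URL.
    return urljoin(base_url, parser.target)
```

A caller would fetch the page, call `find_meta_refresh(page_text, page_url)`, and refetch if it returns a URL. It would presumably also want a small cap on the number of hops so that two pages refreshing to each other can't loop forever. The `<script>location=...</script>` line in the page above is deliberately left alone here, for the reasons already given.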