TxtDot / txtdot

An HTTP proxy that extracts only text, links and pictures from pages, reducing internet bandwidth usage and removing ads and heavy scripts
https://txt.dc09.ru
MIT License
149 stars, 5 forks

Confused by Mastodon URLs? #184

Open alexpdp7 opened 1 week ago

alexpdp7 commented 1 week ago

txtdot seems to choke on Mastodon URLs such as https://mastodon.social/@fabinou . Does the /@ confuse it?

artegoser commented 1 week ago

For some strange reason the stackoverflow engine is triggered.

artegoser commented 1 week ago

When all engines fail, the error that gets reported comes not from the standard engine but from the very first one in the list, which is stackoverflow. So the stackoverflow error is misleading: Readability can't parse this page either. The problem is not the URL. We need to make a custom engine for Mastodon.
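
To illustrate the behavior described above, here is a minimal sketch of an engine fallback loop that records every engine's failure instead of surfacing only the first one. It is not txtdot's actual code: the `Engine` type, the `parseWithFallback` function and the error messages are all hypothetical names chosen for this example.

```typescript
// Hypothetical shape of a parsing engine; txtdot's real interface may differ.
type Engine = {
  name: string;
  // Returns the extracted text, or throws when the page cannot be parsed.
  parse: (html: string, url: URL) => string;
};

// Try each engine in order and return the first successful result.
// If every engine fails, report all failures, not just the first one,
// so the message does not point at an unrelated engine.
function parseWithFallback(engines: Engine[], html: string, url: URL): string {
  const failures: string[] = [];
  for (const engine of engines) {
    try {
      return engine.parse(html, url);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${engine.name}: ${reason}`);
    }
  }
  throw new Error(`All engines failed for ${url.href} (${failures.join("; ")})`);
}
```

With a loop like this, a failing Mastodon page would list both the stackoverflow and the readability failures, which makes it clearer that no engine could handle the page.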

alexpdp7 commented 1 week ago

Oh, don't worry too much. Mastodon has an API and RSS feeds, so having txtdot process it correctly is not a priority for me, and I don't need a custom engine. I was just curious and testing txtdot.

I thought textder would take care of this, though?
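
As a small illustration of the RSS route mentioned above: Mastodon serves a public feed for a local profile when `.rss` is appended to the profile URL. The helper below is a hypothetical sketch (the name `mastodonRssUrl` is made up for this example and is not part of txtdot) that derives the feed URL from a profile URL like the one in this issue.

```typescript
// Hypothetical helper: map a Mastodon profile URL to its RSS feed URL.
// Returns null when the path does not look like a local profile (/@username).
function mastodonRssUrl(profileUrl: string): string | null {
  const url = new URL(profileUrl);
  // Usernames are assumed to consist of letters, digits and underscores.
  const match = url.pathname.match(/^\/@([A-Za-z0-9_]+)\/?$/);
  if (match === null) {
    return null;
  }
  return `${url.origin}/@${match[1]}.rss`;
}
```

For the URL from the report, `mastodonRssUrl("https://mastodon.social/@fabinou")` yields `https://mastodon.social/@fabinou.rss`.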