maubot / rss

An RSS plugin for maubot
GNU Affero General Public License v3.0

failed to load feed: https://sciencebasedmedicine.org/feed #36

Status: Closed (kushal-kumaran closed 2 years ago)

kushal-kumaran commented 2 years ago
!rss subscribe https://sciencebasedmedicine.org/feed
Failed to load feed: <unknown>:7:2: mismatched tag

However, the feed appears to be well-formed.

>>> import requests
>>> import feedparser
>>> r = requests.get("https://sciencebasedmedicine.org/feed", headers={"User-Agent": "maubot/0.3.1 https://github.com/maubot/rss"})
>>> p = feedparser.parse(r.text)
>>> p["feed"]["title"]
'Science-Based Medicine'
>>> 

(My test was with @rss:t2bot.io. I'm not sure which version of the bot is in use there.)

tulir commented 2 years ago
$ curl https://sciencebasedmedicine.org/feed -H "User-Agent: Python" -Li
HTTP/2 403 
date: Fri, 12 Aug 2022 15:23:21 GMT
content-type: text/html
vary: Accept-Encoding
cf-cache-status: DYNAMIC
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 739a36339da59d58-DME
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400

<html>
<head><title>403 Forbidden</title></head>
<body>
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx</center>
</body>
</html>

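The site returns a 403 with an HTML error page when the user agent looks like a default Python client, so the bot is presumably trying to parse that page as a feed, which would explain the "mismatched tag" error. Newer versions of the plugin use a custom user agent (https://github.com/maubot/rss/commit/877dcffb9c695ad0320fc6f5079f626d8bae452b), but previously it used the default aiohttp user agent (Python/1.2 aiohttp/3.4.5).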
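
For anyone who wants to reproduce this locally, here is a rough sketch using only the standard library. It is not the plugin's code: `parse_error_for`, `fetch_feed`, and the `USER_AGENT` value are hypothetical names made up for illustration, and `xml.etree.ElementTree` stands in for whatever parser the bot actually uses. It feeds the 403 body from the curl output to a strict XML parser, and shows one way to surface the HTTP status instead of a confusing parse error.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical user agent string, for illustration only.
USER_AGENT = "example-feed-checker/0.1"

# Body of the 403 page, copied from the curl output in this thread.
FORBIDDEN_PAGE = """<html>
<head><title>403 Forbidden</title></head>
<body>
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx</center>
</body>
</html>
"""


def parse_error_for(body: str) -> Optional[str]:
    """Return the XML parser's error message, or None if body is well-formed."""
    try:
        ET.fromstring(body)
    except ET.ParseError as exc:
        return str(exc)
    return None


def fetch_feed(url: str, user_agent: str = USER_AGENT) -> str:
    """Fetch url and return the body, failing loudly on HTTP errors.

    urllib raises HTTPError for 4xx/5xx responses, so an error page is
    never handed to the feed parser by mistake.
    """
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        raise RuntimeError(f"feed request failed with HTTP {exc.code}") from exc


if __name__ == "__main__":
    # The <hr> tag is never closed, so a strict XML parser rejects the page.
    print(parse_error_for(FORBIDDEN_PAGE))
```

Running the script should print a "mismatched tag" parse error for the 403 page, since HTML like `<hr>` is not well-formed XML. `fetch_feed` is not called here because whether the site accepts a given user agent depends on its Cloudflare rules at the time.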