gugray / rss-parrot

Notifies Mastodon accounts about new posts in the RSS feeds they follow
https://rss-parrot.net
MIT License

Irrawaddy News not working #24

Closed: wadamT closed this issue 7 months ago

wadamT commented 7 months ago

The feed of a popular Burmese news website is not working for me. I can add the feed in Fluent Reader on Linux and in the Feeder app on Android, and it works there. RSS Parrot returns:

Hm, I can't find a feed for this site. Is the address right? It can also happen that the site is temporarily down, or it doesn't have a valid RSS or Atom feed.

Link to website - https://burma.irrawaddy.com
Link to feed - https://burma.irrawaddy.com/category/news/feed

gugray commented 7 months ago

This is what I see in the log for this site:

2024-01-10 13:38:38.327 INFO <logic/feed_follower.go:431> Retrieving site information: https://burma.irrawaddy.com/category/news/feed
2024-01-10 13:38:38.359 WARN <logic/feed_follower.go:191> Failed to get https://burma.irrawaddy.com/category/news/feed: Get "https://burma.irrawaddy.com/category/news/feed": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2024-01-10 13:38:38.359 INFO <logic/inbox.go:366> Could not create/retrieve account for site: https://burma.irrawaddy.com/category/news/feed: Get "https://burma.irrawaddy.com/category/news/feed": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2024-01-10 13:51:06.690 INFO <logic/feed_follower.go:431> Retrieving site information: https://www.irrawaddy.com/category/news/feed
2024-01-10 13:51:06.715 WARN <logic/feed_follower.go:191> Failed to get https://www.irrawaddy.com/category/news/feed: Get "https://www.irrawaddy.com/category/news/feed": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2024-01-10 13:51:06.716 INFO <logic/inbox.go:366> Could not create/retrieve account for site: https://www.irrawaddy.com/category/news/feed: Get "https://www.irrawaddy.com/category/news/feed": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Either this is a transient network error (or the site is temporarily slow), or something between the website and the parrot is blocking the parrot's requests. The birb uses a 10-second timeout.

Hope this helps!

gugray commented 7 months ago

Turns out this may totally have been an issue with the Parrot! By mistake the birb was using a 10 msec timeout instead of a 10 second one in one place. Fixed via #27. Could you check if it works now?

wadamT commented 7 months ago

I have checked and RSS parrot is still not finding the feed.

gugray commented 7 months ago

Now the logs show that the birb's request is rejected with a 403 ("Forbidden") error code when I request the site itself. This typically happens when the website blocks the caller for being a bot, or based on its IP address.

However, I was able to retrieve the feed directly; that URL doesn't seem to block bots. The Parrot account for it is https://toot.community/@burma.irrawaddy.com.category.news@rss-parrot.net