omnivore-app / omnivore

Omnivore is a complete, open source read-it-later solution for people who like reading.
https://omnivore.app
GNU Affero General Public License v3.0

Bug: There was an error adding new feed: Item not found #3672

Open flethj opened 3 months ago

flethj commented 3 months ago

I'm running into an issue when trying to import an RSS feed.

Feed: https://openrss.org/https://www.youtube.com/@ByteByteGo/videos

Error: "There was an error adding new feed: Item not found"

The feed appears to exist and it can be imported into RSS readers (I tried Reeder and it worked). However, it does not work in Omnivore. I tried both the web and iOS versions.

jacksonh commented 3 months ago

Hey @flethj, are you adding that URL directly? It looks like an HTML page to me, but maybe it's meant to change based on the accepted content type?

flethj commented 3 months ago

Hello, I tried to add exactly this URL. The returned content probably depends on something like the accepted content type. Chrome shows the HTML page, but Safari, for example, recognizes the RSS feed and asks whether to open it in an RSS reader.

The weird thing is that I've successfully used openrss.org in combination with YouTube channels in Omnivore before, exactly as above but with a different channel. A different RSS client can also (still) read the feed at this exact URL, which is why I suspect that something in Omnivore broke.

jacksonh commented 3 months ago

Yeah, it's likely because we added some extra accepted content types: a few other feeds required things like HTML to be accepted, probably due to misconfigured load balancers.
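To illustrate the content negotiation being discussed: a fetcher can list feed media types first in its `Accept` header and give HTML a lower quality value, then check the `Content-Type` of whatever comes back. The Python sketch below is an assumption for illustration only. It is not Omnivore's actual fetch code, the function names and the list of media types are hypothetical, and whether openrss.org really varies its response on `Accept` is only the guess made in this thread.

```python
# Hypothetical sketch; not Omnivore's actual fetch code.
# Media types a feed fetcher would normally prefer (an assumed list).
FEED_TYPES = (
    "application/rss+xml",
    "application/atom+xml",
    "application/xml",
    "text/xml",
)


def build_accept_header(html_quality: float = 0.5) -> str:
    """Prefer feed media types; accept HTML only as a lower-priority fallback."""
    parts = list(FEED_TYPES)
    # A q-value below 1 tells the server that HTML is acceptable but less preferred.
    parts.append(f"text/html;q={html_quality:g}")
    return ", ".join(parts)


def is_feed_content_type(content_type: str) -> bool:
    """Return True when a Content-Type header value names a feed-like media type."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in FEED_TYPES
```

A server that honors q-values should then return the feed rather than the HTML page when both are available; a server that ignores them may still answer with HTML, which `is_feed_content_type` would flag.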

niksart commented 3 months ago

I get the same error adding this feed:

https://direct.mit.edu/rss/site_1000093/1000049.xml

I guess the problem has to do with the presence of an underscore in the link. Should I open a new issue?

jacksonh commented 3 months ago

> I get the same error adding this feed:
>
> https://direct.mit.edu/rss/site_1000093/1000049.xml
>
> I guess the problem has to do with the presence of an underscore in the link. Should I open a new issue?

curl https://direct.mit.edu/rss/site_1000093/1000049.xml 
<html><body><h1>403 Forbidden</h1>
Request forbidden by administrative rules.
</body></html>

Looks like they are blocking requests to that feed in some way.

niksart commented 3 months ago

It seems that the problem is the user agent "curl" that is blocked by them:

curl -A "curl" "https://direct.mit.edu/rss/site_1000093/1000049.xml"
<html><body><h1>403 Forbidden</h1>
Request forbidden by administrative rules.
</body></html>

Any other string works. Try for example:

curl -A "qwertyuiop" "https://direct.mit.edu/rss/site_1000093/1000049.xml"
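The same comparison can be scripted. The Python sketch below sends one GET per User-Agent string and records the status code for each; the function names are hypothetical, and the `fetch` parameter exists only so the logic can be exercised without network access. Results will depend on the server's rules and on the caller's IP address.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET sent with the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions; the code is still the answer.
        return error.code


def probe_user_agents(
    url: str,
    user_agents: Iterable[str],
    fetch: Callable[[str, str], int] = fetch_status,
) -> Dict[str, int]:
    """Map each User-Agent string to the status code returned for it."""
    return {agent: fetch(url, agent) for agent in user_agents}
```

Calling `probe_user_agents("https://direct.mit.edu/rss/site_1000093/1000049.xml", ["curl", "qwertyuiop"])` would repeat the two curl requests above in one step.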

jacksonh commented 3 months ago

Yeah, I suspect they are also blocking some IPs. I can add this feed fine in local development but can't from our backend.

fabianlandwehr1 commented 2 months ago

Hello, are there any plans to change the accepted content types again (so that openrss.org works again)?

CopyPasteFail commented 2 months ago

Same problem for me here:

$ curl https://orikatz.wordpress.com/feed/

`<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"

............`

The URL works in feedly.

micmalti commented 3 weeks ago

I'm facing the same issue with the DistroWatch RSS feeds.

jacksonh commented 3 weeks ago

> Same problem for me here:
>
> * https://www.geektime.co.il/feed/
>
> $ curl https://orikatz.wordpress.com/feed/
>
> <?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/" > <channel>............
>
> The URL works in feedly.

I get this: https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fwww.geektime.co.il%2Ffeed%2F
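For a quick local check of what a URL actually returns, one option is to look at the root element of the response body. The Python sketch below is a rough heuristic under that assumption, not a substitute for the validator and not how Omnivore parses feeds; `detect_feed_kind` is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def detect_feed_kind(body: str) -> Optional[str]:
    """Return "rss", "atom", or "rdf" based on the XML root element, else None."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None  # the body is not well-formed XML
    # Strip a "{namespace}" prefix, if any, to get the bare tag name.
    local_name = root.tag.rsplit("}", 1)[-1].lower()
    if local_name == "rss":
        return "rss"
    if local_name == "feed":
        return "atom"
    if local_name == "rdf":
        return "rdf"
    return None  # well-formed XML, but not a recognized feed root (e.g. <html>)
```

A body like the 403 page shown earlier in this thread has an `<html>` root and would return `None`, while an RSS 2.0 document with an `<rss>` root would return `"rss"`.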

jacksonh commented 3 weeks ago

> I'm facing the same issue with the DistroWatch RSS feeds.

Can you give an example URL you are using?

micmalti commented 3 weeks ago

Sure. Here's one: https://distrowatch.com/news/dw.xml

stevenrobertson commented 3 days ago

Same issue with https://www.science.org/digital-feed/pipeline