Athou / commafeed

Google Reader inspired self-hosted personal RSS reader.
https://www.commafeed.com
Apache License 2.0

Error while adding gazzetta.it rss feeds #1260

Open Paolo7297 opened 9 months ago

Paolo7297 commented 9 months ago

Whenever I try to add a gazzetta.it feed (like https://www.gazzetta.it/dynamic-feed/rss/section/Calcio/Serie-A.xml), it throws this error: org.xml.sax.SAXParseException: DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true. Is there a way to bypass the error? The feed works fine in other RSS readers. Thanks!


Athou commented 9 months ago

There is a DOCTYPE declaration at the top of the feed, which is unusual.


The parser CommaFeed is using actively blocks feeds with a DOCTYPE declaration for security reasons (see https://github.com/rometools/rome/issues/203 and https://en.wikipedia.org/wiki/Billion_laughs_attack).
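For reference, the error can be reproduced outside CommaFeed. The sketch below uses the JDK's built-in DOM parser as a stand-in (ROME configures its own parser, so this is an assumption about equivalent behavior, not CommaFeed's code). The class name `DoctypeCheck` and the method `tryParse` are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class DoctypeCheck {
    // Feature name taken from the error message in this thread.
    static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    /** Returns null if the XML parses, or the parser's message if it is rejected. */
    static String tryParse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // With this feature enabled, any DOCTYPE declaration is a fatal error.
            factory.setFeature(DISALLOW_DOCTYPE, true);
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        } catch (Exception e) {
            // Configuration or I/O problems are not expected for in-memory input.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `DoctypeCheck.tryParse` on a document that starts with `<!DOCTYPE rss>` should return a message like the one reported above, while the same document without the declaration should return null.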

Maybe I can remove the DOCTYPE from the XML before the parsing occurs, I'll see what I can do.
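A minimal sketch of what stripping the DOCTYPE before parsing could look like, assuming the feed body is already available as a string. This is not CommaFeed's actual implementation; `DoctypeStripper` and `stripDoctype` are hypothetical names, and the regex is a rough approximation that would fail on a `>` inside a quoted identifier.

```java
import java.util.regex.Pattern;

final class DoctypeStripper {
    // Matches "<!DOCTYPE ...>" with an optional internal subset in brackets.
    private static final Pattern DOCTYPE = Pattern.compile(
            "<!DOCTYPE[^\\[>]*(?:\\[.*?\\]\\s*)?>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Removes the first DOCTYPE declaration, if any, and leaves the rest untouched. */
    static String stripDoctype(String xml) {
        return DOCTYPE.matcher(xml).replaceFirst("");
    }
}
```

Since the whole declaration is dropped, any entities defined in its internal subset would no longer resolve, so a feed that relies on them would still fail to parse.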

In the meantime, you could contact the website to ask them to remove the DOCTYPE declaration.

Paolo7297 commented 9 months ago

Thanks! Actually, their contact form isn't working. I hope it will be in the next few days.

travisbeard commented 7 months ago

I also have several that could not be imported when switching from Feedly. It would be nice to have an option to ignore it.

Athou commented 7 months ago

> I also have several that could not be imported when switching from Feedly. It would be nice to have an option to ignore it.

Do you get the same error as above? What are the feed URLs that are not working?

travisbeard commented 7 months ago

This is no longer bothering me. I found that each of these sites has two feeds, one with the DOCTYPE and one without. The website parser in CommaFeed finds the wrong one by default, but the second feed worked for all of these sites.

dstutz commented 1 month ago

I am also getting this exact error on some private GitLab EE Activity feeds, and I can't add them. I would love to have these feeds available.

The feed doesn't appear to have a doctype, though.

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
<title>******activity</title>
<link href="https://**********?feed_token=glft-******" rel="self" type="application/atom+xml"/>

GitLab Enterprise Edition v16.11.10-ee

I was able to get an issues feed subscribed.

Athou commented 1 month ago

I seem to be able to subscribe to this URL though. Are you getting a "DOCTYPE is disallowed" error? Do you have an error in the log files with a stack trace?

dstutz commented 1 month ago

In the UI: [image]

Nothing is showing up in the logs. I am also still running v4.6.0. I've been putting off updating the config for the new Quarkus versions.

Yeah, I tested with some public gitlab.org projects and it works fine there. I don't know if it's specifically the EE version I'm trying to subscribe to or something else. Like I said, I was able to subscribe to the issues feed for one of the sub-projects.

dstutz commented 1 month ago

Ok...changed to DEBUG level and got some output. Maybe the way that instance is set up, the feed token isn't working right. It looks like it might be redirecting to a login page for some reason, and I guess THAT is what is not parsing correctly (yes...at the top of the login page: <!DOCTYPE html>).

DEBUG [2024-10-04 10:08:16,878] com.commafeed.backend.HttpGetter: fetching https://*****feed url with token******
DEBUG [2024-10-04 10:08:17,728] com.commafeed.frontend.resource.FeedREST: Could not parse feed from https://*****/users/sign_in : Invalid XML: Error on line 1: DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.

 Causing: com.rometools.rome.io.FeedException: Could not parse feed from https://*******/users/sign_in : Invalid XML: Error on line 1: DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.
! at com.commafeed.backend.feed.parser.FeedParser.parse(FeedParser.java:85)
! at com.commafeed.backend.feed.FeedFetcher.fetch(FeedFetcher.java:46)
! at com.commafeed.frontend.resource.FeedREST.fetchFeedInternal(FeedREST.java:242)
! at com.commafeed.frontend.resource.FeedREST.fetchFeed(FeedREST.java:269)
! at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
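
The trace above shows the parser being handed the HTML of /users/sign_in rather than a feed. One way to surface that more clearly would be to check whether the fetched body looks like an HTML page before parsing it. The sketch below is only an illustration under that assumption. `HtmlSniffer` and `looksLikeHtml` are hypothetical names and not part of CommaFeed.

```java
import java.util.Locale;

final class HtmlSniffer {
    /** Rough check: does the body start like an HTML page instead of an XML feed? */
    static boolean looksLikeHtml(String body) {
        String head = body;
        // Drop a leading byte order mark, then any leading whitespace.
        if (head.startsWith("\uFEFF")) {
            head = head.substring(1);
        }
        head = head.stripLeading();
        // Only the first few characters matter; compare case-insensitively.
        head = head.substring(0, Math.min(head.length(), 64)).toLowerCase(Locale.ROOT);
        return head.startsWith("<!doctype html") || head.startsWith("<html");
    }
}
```

With a check like this, a caller could report something like "the URL returned an HTML page, possibly a login redirect" instead of the DOCTYPE error, which is misleading in this case.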