ravenscroftj / freshrss-flaresolverr-extension

FreshRSS plugin that provides Cloudflare puzzle solving via FlareSolverr
GNU Affero General Public License v3.0

[Bug] Fanfiction.net feed seems to be not working #1

Open SMylk opened 1 year ago

SMylk commented 1 year ago

Getting no result on: https://www.fanfiction.net/atom/l/?&cid1=224&r=103&s=1

ProjectMoon commented 1 year ago

This is likely caused by FlareSolverr returning the feed as an entity-encoded document inside HTML. Some feeds, like the Substack one in this repository's README, come back with a proper <rss> tag embedded in the HTML response. Others, like your example (or https://gateworld.net/feed, which is what I was trying to add), come back with &lt;rss&gt; in the HTML response from FlareSolverr. The extension is unable to recognize this.

ProjectMoon commented 1 year ago

There was a returnRawHtml parameter in FlareSolverr, but it appears to have been removed and not reimplemented for some reason. Ultimately, this appears to be a problem with FlareSolverr. I'm not sure what causes it to render some feeds as raw XML embedded in HTML and others as XML escaped into HTML entities (&lt; etc). This can be worked around in the extension by loading the inner text of the <body> from FlareSolverr, running html_entity_decode on it (https://www.php.net/manual/en/function.html-entity-decode.php), and then passing the result to the existing code that loads the feed.
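To make the workaround concrete, here is a minimal sketch in Python rather than the extension's PHP. It assumes the FlareSolverr response is an HTML page whose <body> holds the feed as escaped text, possibly wrapped in tags such as <pre>. The function name `extract_feed_xml` is hypothetical, and `html.unescape` stands in for PHP's html_entity_decode.

```python
import html
import re


def extract_feed_xml(solver_html: str) -> str:
    """Recover feed XML that was escaped into HTML entities inside <body>."""
    # Take the contents of <body>; fall back to the whole document if absent.
    match = re.search(r"<body[^>]*>(.*?)</body>", solver_html,
                      re.DOTALL | re.IGNORECASE)
    body = match.group(1) if match else solver_html
    # Drop real markup (e.g. a <pre> wrapper). The feed itself is escaped
    # as &lt;...&gt;, so it is not affected by this step.
    inner_text = re.sub(r"<[^>]+>", "", body)
    # Decode entities once, turning &lt;rss&gt; back into <rss>.
    return html.unescape(inner_text).strip()
```

The decoded string can then be handed to whatever code already parses the feed. Entities are decoded only once, so text that was escaped inside the feed itself (for example &amp;amp;) comes out as valid XML (&amp;).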

ProjectMoon commented 1 year ago

Submitted a PR that might help with this issue for you (it did for me).

ravenscroftj commented 1 year ago

PR merged - maybe try downloading the plugin again and trying again now?

SMylk commented 1 year ago

Tried it, doesn't seem to be working, for https://gateworld.net/feed I get back: <?xml version="1.0"?>

For https://www.fanfiction.net/atom/l/?&cid1=8&r=103&s=1 <?xml version="1.0"?>

I have the latest FlareSolverr and can see in the logs that it fetches the content.

ProjectMoon commented 1 year ago

What error do you get from PHP?

ProjectMoon commented 1 year ago

Hmm, the problem in this case is probably that the fanfiction feed is an Atom feed, not RSS. The extension only looks for the <rss> element.

SMylk commented 1 year ago

Yes, you got it, the fanfiction feed appears to be Atom. For Gateworld I get: A feed could not be found at http://192.168.1.200:1180/api/cloudsolver.php?feed=https://www.gateworld.net/feed/; the status code is 200 and content-type is application/xml

ProjectMoon commented 1 year ago

For Gateworld you will need to use the new viahtml=1 parameter. For some reason, the Gateworld feed is returned as XML embedded in HTML, and loading via HTML fixes it.
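As an illustration of how the parameter might be added to the proxy URL, here is a small Python sketch. The endpoint and the `feed` and `viahtml` parameter names come from this thread. The helper name `build_proxy_url` is hypothetical, and percent-encoding the feed URL is an assumption, since the URL quoted above passes it unencoded.

```python
from urllib.parse import urlencode


def build_proxy_url(endpoint: str, feed_url: str, via_html: bool = False) -> str:
    """Build a cloudsolver.php URL for a feed, optionally adding viahtml=1."""
    params = {"feed": feed_url}
    if via_html:
        # Ask the extension to load the feed via HTML.
        params["viahtml"] = "1"
    # urlencode percent-encodes the feed URL (an assumption, see above).
    return endpoint + "?" + urlencode(params)


url = build_proxy_url(
    "http://192.168.1.200:1180/api/cloudsolver.php",
    "https://gateworld.net/feed",
    via_html=True,
)
```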

But for the Atom feed, the extension will need another update. The easiest solution is either to fall back to the Atom root element when <rss> isn't present, or to provide yet another GET query string parameter that tells it which type of feed to try to load.
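The fallback option could look roughly like the following Python sketch (again, not the extension's actual PHP code). It assumes the feed XML has already been extracted as a string, and relies on the fact that an Atom document's root is a <feed> element in the http://www.w3.org/2005/Atom namespace. The function name `detect_feed_type` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def detect_feed_type(xml_text: str) -> Optional[str]:
    """Return 'rss' or 'atom' based on the root element, or None if neither."""
    root = ET.fromstring(xml_text)
    # Try RSS first, as the extension currently does.
    if root.tag == "rss":
        return "rss"
    # Fall back to Atom; accept a bare <feed> too, in case the namespace is missing.
    if root.tag in (ATOM_NS + "feed", "feed"):
        return "atom"
    return None
```

Checking the root element avoids the need for an extra query string parameter, at the cost of parsing the document once before handing it on.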

SMylk commented 1 year ago

Nice, viahtml is working.