somini closed this issue 5 years ago
I have the same issue. But these feeds work.
The problem is related to Cloudflare; this is probably their anti-bot system. The first HTTP request, which discovers the website and icon, works, but the second one, which fetches the feed, does not.
It looks like Cloudflare is rate limiting the Hacker News website: if you make more than one HTTP request within one second, the later one is blocked.
You could try with curl:
$ curl -I https://news.ycombinator.com/rss && curl -I https://news.ycombinator.com/rss
HTTP/1.1 200 OK
Date: Wed, 09 May 2018 04:53:49 GMT
Content-Type: application/rss+xml
Connection: keep-alive
Set-Cookie: __cfduid=d361ca5058ea776bc63102d05db4f123f1525841629; expires=Thu, 09-May-19 04:53:49 GMT; path=/; domain=.ycombinator.com; HttpOnly
Cache-Control: private; max-age=0
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Referrer-Policy: origin
Strict-Transport-Security: max-age=31556900
Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline' https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ https://cdnjs.cloudflare.com/; frame-src 'self' https://www.google.com/recaptcha/; style-src 'self' 'unsafe-inline'
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
CF-RAY: 4181908598a23b20-XXX
HTTP/1.1 503 Service Temporarily Unavailable
Date: Wed, 09 May 2018 04:53:49 GMT
Content-Type: text/html
Content-Length: 537
Connection: keep-alive
Set-Cookie: __cfduid=dc270715d50b8423d7a9be5269c3acd6e1525841629; expires=Thu, 09-May-19 04:53:49 GMT; path=/; domain=.ycombinator.com; HttpOnly
ETag: "5a2ce78d-219"
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
CF-RAY: 418190871cdb3b38-XXX
But if you wait one second between requests, it works:
$ curl -I https://news.ycombinator.com/rss && sleep 1 && curl -I https://news.ycombinator.com/rss
HTTP/1.1 200 OK
Date: Wed, 09 May 2018 04:55:22 GMT
Content-Type: application/rss+xml
Connection: keep-alive
Set-Cookie: __cfduid=d75e2370c388bec136c7925edfb226e8d1525841721; expires=Thu, 09-May-19 04:55:21 GMT; path=/; domain=.ycombinator.com; HttpOnly
Cache-Control: private; max-age=0
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Referrer-Policy: origin
Strict-Transport-Security: max-age=31556900
Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline' https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ https://cdnjs.cloudflare.com/; frame-src 'self' https://www.google.com/recaptcha/; style-src 'self' 'unsafe-inline'
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
CF-RAY: 418192c4cabd3b20-XXX
HTTP/1.1 200 OK
Date: Wed, 09 May 2018 04:55:23 GMT
Content-Type: application/rss+xml
Connection: keep-alive
Set-Cookie: __cfduid=de951e2f9a929c59df8ff36796932437e1525841723; expires=Thu, 09-May-19 04:55:23 GMT; path=/; domain=.ycombinator.com; HttpOnly
Cache-Control: private; max-age=0
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Referrer-Policy: origin
Strict-Transport-Security: max-age=31556900
Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline' https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ https://cdnjs.cloudflare.com/; frame-src 'self' https://www.google.com/recaptcha/; style-src 'self' 'unsafe-inline'
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
CF-RAY: 418192d31c5a3b50-XXX
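A client could apply the same idea by spacing out requests to the same host. The following is only a rough Python sketch, not how Miniflux does it: the `HostThrottle` class and its parameters are hypothetical names, and the one-second interval is just the value that happened to work in the curl test above, not a documented limit.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Space out requests to the same host by a minimum interval (hypothetical helper)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        # Assumption: 1.0 s mirrors the "sleep 1" that worked above.
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_request = {}  # host -> time of the last request to it

    def wait(self, url):
        """Sleep if the URL's host was contacted too recently; return the delay used."""
        host = urlsplit(url).hostname
        now = self._clock()
        last = self._last_request.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record when the upcoming request will actually be sent.
        self._last_request[host] = now + delay
        return delay


# Usage: call throttle.wait(url) right before each HTTP request, for example
# once before the site/icon discovery request and once before the feed fetch.
```

Keeping the timestamps per host means feeds on other sites are not slowed down by the delay.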
Hello! I also have a similar problem but I wasn't able to debug it. Feed URL: https://www.rts.ch/la-1ere/programmes/les-beaux-parleurs/podcast/?flux=rss
At first I imported it with OPML, so it was in my subscriptions, but Refresh said "error code 403" (which does not match what I get when opening the feed directly). Then I removed the imported subscription, and when creating a new one with this URL I get the error "Unable to find any subscription".
@w2ak Looks like it's an invalid RSS feed, according to the W3C validator.
@w2ak Your issue is not exactly the same as the original one. Akamai is blocking Miniflux based on the headers sent by the HTTP client. You can simulate this behavior with curl:
$ curl -v -H "User-Agent: Mozilla/5.0 (compatible; Miniflux/2.0.7; +https://miniflux.net)" -H "Accept: */*" "https://www.rts.ch/la-1ere/programmes/les-beaux-parleurs/podcast/?flux=rss"
* Trying 104.64.112.165...
* TCP_NODELAY set
* Connected to www.rts.ch (104.64.112.165) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate: *.rts.ch
* Server certificate: DigiCert SHA2 High Assurance Server CA
* Server certificate: DigiCert High Assurance EV Root CA
> GET /la-1ere/programmes/les-beaux-parleurs/podcast/?flux=rss HTTP/1.1
> Host: www.rts.ch
> User-Agent: Mozilla/5.0 (compatible; Miniflux/2.0.7; +https://miniflux.net)
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Server: AkamaiGHost
< Mime-Version: 1.0
< Content-Type: text/html
< Content-Length: 338
< Expires: Fri, 18 May 2018 03:44:09 GMT
< Date: Fri, 18 May 2018 03:44:09 GMT
< Connection: close
<
<HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>
You don't have permission to access "http://www.rts.ch/la-1ere/programmes/les-beaux-parleurs/podcast/?" on this server.<P>
Reference #18.d5eafea5.1526615049.ae5d7d6
</BODY>
</HTML>
Removing either the User-Agent or the Accept header makes it work. But these headers are valid. The web is a nasty place.
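To repeat that comparison without editing curl commands by hand, something like the Python sketch below could send the request once with both headers and once with each header left out. The function names (`header_variants`, `probe`, `compare`) are made up for illustration. Note also that `http.client` adds its own `Host` and `Accept-Encoding: identity` headers, so the requests are not byte-for-byte identical to the curl ones and the server may respond differently.

```python
import http.client
from urllib.parse import urlsplit

# Headers copied from the curl command above.
BASE_HEADERS = {
    "User-Agent": "Mozilla/5.0 (compatible; Miniflux/2.0.7; +https://miniflux.net)",
    "Accept": "*/*",
}


def header_variants(headers):
    """Return (label, headers) pairs: the full set, then one header removed at a time."""
    variants = [("all headers", dict(headers))]
    for name in headers:
        reduced = {k: v for k, v in headers.items() if k != name}
        variants.append((f"without {name}", reduced))
    return variants


def probe(url, headers, timeout=10):
    """Send a single HTTPS GET request and return the HTTP status code."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        conn.request("GET", path, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()


def compare(url, headers=BASE_HEADERS, probe_func=probe):
    """Map each variant label to the status code returned for that variant."""
    return {label: probe_func(url, variant)
            for label, variant in header_variants(headers)}
```

Calling `compare()` with the feed URL returns a dict of status codes keyed by variant, which makes it easy to see which header combination triggers the 403.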
FWIW, another workaround is to use https://github.com/edavis/hnrss, self-hosted or via hnrss.org. I use it to get more granular control of HN feeds, but it might be worth seeing how they avoid rate-limiting.
Cloudflare strikes again. Thanks for debugging this, @fguillot.
I'm running Miniflux on an RPi3 for myself, so I wanted to avoid having to install many dependencies on it. At least it's not PHP...
It seems hnrss was rewritten in Go: https://github.com/edavis/go-hnrss
I forked it over at GitLab and set up the CI so that binaries are built automatically. Anyone can reuse the binaries; you can confirm that the only commits I made were to configure the CI. https://gitlab.com/somini/go-hnrss I run this on a separate port, configure nginx to proxy it, and subscribe to the feeds as usual.
This is good enough for me, so @fguillot can close this, or add it to the docs.
https://news.ycombinator.com/news, feed URL: https://news.ycombinator.com/rss
This gives the error:
Unable to fetch feed (statusCode=503)
I tested by putting the feed in an OPML file and importing that, and it succeeded.
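For anyone who wants to try the same OPML workaround, here is a minimal Python sketch that writes such a file. `build_opml` is a hypothetical helper, and the attribute set (`type`, `text`, `title`, `xmlUrl`, `htmlUrl`) follows common OPML 2.0 usage; I have not checked which of them Miniflux actually requires.

```python
import xml.etree.ElementTree as ET


def build_opml(feeds, title="Subscriptions"):
    """Return an OPML 2.0 document as a string.

    `feeds` is an iterable of (feed_title, site_url, feed_url) tuples.
    """
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for feed_title, site_url, feed_url in feeds:
        ET.SubElement(
            body,
            "outline",
            type="rss",
            text=feed_title,
            title=feed_title,
            xmlUrl=feed_url,
            htmlUrl=site_url,
        )
    # Add the XML declaration by hand so the declared encoding is always UTF-8.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(opml, encoding="unicode")


def write_opml(path, feeds):
    """Write the OPML document to `path`, encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(build_opml(feeds))
```

For the feed in this report, the tuple would be `("Hacker News", "https://news.ycombinator.com/news", "https://news.ycombinator.com/rss")`, and the resulting file can then be imported as usual.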