RSS-Bridge / rss-bridge

The RSS feed for websites missing it
https://rss-bridge.org/bridge01/
The Unlicense

Bandcamp: bridge returned error 429 #1959

Open thezeroalpha opened 3 years ago

thezeroalpha commented 3 years ago

I have a number of Bandcamp artist feeds (around 40), with the "articles are" option set to "releases, new one when tracklist changes". When I refresh the feeds, some of them fail with error 429 (HTTP error for 'too many requests'). Is there any way to fix this?

I am on the latest commit of RSS-bridge (ea289a0).

em92 commented 3 years ago

Ping @Roliga as one of the maintainers of this bridge.

thezeroalpha commented 3 years ago

@Roliga and @sebsauvage, you are both listed as maintainers for the Bandcamp bridge; are you still interested in maintaining the bridge or should I look into the issues myself?

em92 commented 3 years ago

@thezeroalpha, I contacted Roliga via IRC.

(22:13:54) Roliga: Hey em92_! Thanks for letting me know. I've been quite flooded with other things recently but I'll see if I can do something about this tonight!

If he does not answer within a day, I suggest you look into this issue yourself.

em92 commented 3 years ago

Offtopic: https://github.com/RSS-Bridge/rss-bridge/issues/870

Roliga commented 3 years ago

Hey, sorry for taking so long to look into this.

The problem here is of course that we're hitting Bandcamp's rate limit, and the only thing the bridge itself could do is retry failed requests a few times and/or wait a bit between requests. I think it could even be nice to have this kind of retry/rate limiting be part of RSS Bridge itself, so other bridges could benefit from it too.

You can also work around this with tools around RSS Bridge:

I'll look into implementing some basic retry loop into this bridge, and depending on what others think we could maybe try adding something more global for all bridges to handle rate limiting.
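
To make the retry idea concrete, here is a minimal sketch in Python (not the bridge's PHP, and not an existing RSS-Bridge API). The `fetch_with_retry` name, the `(status, body)` shape of `fetch`, and the delay values are all assumptions chosen for illustration; the wait doubles after each 429 response.

```python
import time
from typing import Callable, Tuple


def fetch_with_retry(
    fetch: Callable[[str], Tuple[int, str]],  # hypothetical: returns (HTTP status, body)
    url: str,
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(url), retrying with exponential backoff while it returns 429."""
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status != 429:
            if status >= 400:
                # Other HTTP errors are not retried in this sketch.
                raise RuntimeError(f"HTTP {status} for {url}")
            return body
        if attempt < max_attempts - 1:
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts: {url}")
```

Passing `sleep` as a parameter is only there so the backoff can be checked without actually waiting. A real implementation might also honor a `Retry-After` response header if the server sends one.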

thezeroalpha commented 3 years ago

Thanks for responding @Roliga! Of course, no worries. I thought that might be the problem, but I couldn't figure out a way around it. I don't really want to limit concurrent connections to RSS-Bridge in general, because I use it for other websites too and the Bandcamp bridge is the only one I've had issues with. It might be good if RSS-Bridge either allowed a limiting option for bridges/domains, or perhaps a retry loop like you said would work well enough.
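
A per-domain limiting option could look roughly like the following Python sketch. This is not an existing RSS-Bridge feature: `DomainThrottle`, its `min_interval` mapping, and the one-second value in the usage note are hypothetical. It enforces a minimum gap between requests to a configured domain (including its subdomains) and leaves every other site untouched.

```python
import time
from typing import Callable, Dict, Optional
from urllib.parse import urlparse


class DomainThrottle:
    """Enforce a minimum interval between requests to selected domains."""

    def __init__(
        self,
        min_interval: Dict[str, float],  # hypothetical config: domain -> seconds
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last_request: Dict[str, float] = {}

    def _match(self, host: str) -> Optional[str]:
        # A configured domain also covers its subdomains.
        for domain in self.min_interval:
            if host == domain or host.endswith("." + domain):
                return domain
        return None

    def wait(self, url: str) -> float:
        """Sleep if needed before requesting url; return the seconds waited."""
        domain = self._match(urlparse(url).hostname or "")
        if domain is None:
            return 0.0  # unlisted domains are never delayed
        now = self.clock()
        last = self.last_request.get(domain)
        delay = 0.0
        if last is not None:
            delay = max(0.0, last + self.min_interval[domain] - now)
        if delay > 0:
            self.sleep(delay)
        self.last_request[domain] = now + delay
        return delay
```

Usage would be something like `throttle = DomainThrottle({"bandcamp.com": 1.0})` followed by `throttle.wait(url)` before each request, so only Bandcamp requests are spaced out.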

Something else that might be useful: NewPipe (originally a YouTube alternative for Android) recently merged a PR that adds Bandcamp streaming support. It could be worth seeing whether they're doing anything special in relation to rate limiting, though it could be that for their purpose they simply don't send as many concurrent requests.

mightymt commented 3 years ago

I can also think of another way to possibly reduce the number of requests here. This would however only make a difference if type “Individual tracks” is used for option “Articles are”:

I think it might be a good idea to cache the JSON data fetched at BandcampBridge.php#L184 using e.g. getSimpleHTMLDOMCached (in case this function works properly with non-HTML content). It's probably unnecessary to fetch the details for each individual track every time the bridge data is built.
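
As a rough illustration of the caching idea (a generic in-memory sketch in Python, not how `getSimpleHTMLDOMCached` works internally), the per-track JSON could be kept for some time and only re-fetched once it expires. `TTLCache`, `fetch`, and the one-hour default are hypothetical names and values.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Cache fetched bodies per URL and re-fetch only after ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[str], str],  # hypothetical: returns the response body
        ttl_seconds: float = 3600.0,  # assumed value, not a bridge default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.fetch = fetch
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self.clock()
        hit = self._entries.get(url)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: no request is sent
        body = self.fetch(url)
        self._entries[url] = (now, body)
        return body
```

With something like this, rebuilding the feed would only hit Bandcamp for tracks whose cached details have expired or were never fetched.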

Not sure whether that would fix the problem for @thezeroalpha because they didn't mention which “Articles are” output type they are using.

thezeroalpha commented 3 years ago

> Not sure whether that would fix the problem for @thezeroalpha because they didn't mention which “Articles are” output type they are using.

Thank you for pointing that out, I changed the issue description to include this. I have it set to "releases, new one when track list changes", for all band feeds.

mightymt commented 3 years ago

@thezeroalpha Ok, then my suggested fix won't apply to you.

Maybe you could also try to increase the cache timeout, provided you have custom_timeout enabled in your config.ini.php. The default value for this bridge seems rather low at 600 seconds (10 minutes).

mrzool commented 2 years ago

I'm getting this error with all my Bandcamp feeds and I'm not entirely clear on what I should do to fix it.

I'm using this public instance of rss-bridge.

Grateful for any pointer.

dvikan commented 2 years ago

I think this is naive HTTP server rate limiting. Maybe sleep(1) between requests?

ghost commented 1 year ago

I'm having this same issue with all my Bandcamp bridges (on a self-hosted private instance). If I use my client to filter out all the error notifications, can I assume everything will work, just slower? I don't mind waiting an extra hour or two for a successful scrape.

em92 commented 1 year ago

> I use my client to filter out all the error notifications

You can add this to config.ini.php so that error messages do not appear in the feed:

[error]
output = "http"

> I don't mind waiting an extra hour or two for a successful scrape.

Have you tried decreasing the feed fetch period in your RSS client?

dvikan commented 1 year ago

This might be possible to tackle now with the improved cache API.