oknozor / musicbrainz_rs

A wrapper around the musicbrainz API
MIT License
41 stars 18 forks source link

Implement retry on 503 server response, using `Retry-After` Header value #13

Closed oknozor closed 3 years ago

oknozor commented 5 years ago

Rate limiting cause integration tests to fail randomly, for now a quick'n dirty fix has been made with thread sleeps on every test.

Adding a proper user agent header as described in MusicBrainz rate limiting documentation might solve the issue.

This still need to be figured out but implementing a retry on HTTP 503 responses might also be a good idea.

Boscop commented 4 years ago

@oknozor From the wording it seems that only anonymous user-agents are subject to throttling (?) Have you tried cycling different browser user-agents?

oknozor commented 4 years ago

At the time I opened this issue I had not yet written the user agent feature, so I didn't thought about doing it this way. That's a great idea, I will try that when I have some spare time. Or maybe you would like to submit a PR ?

ritiek commented 3 years ago

Should we simply include headers in every request by default now that we have the user agent feature? In the docs under #User-Agent, it mentions:

For "python-musicbrainz/0.7.3": we allow through (on average) 50 requests per second, and decline the rest (though recently this has not been hit).

I glanced at the source code to python-musicbrainz2 and it indeed sets the user agent by default to python-musicbrainz/$version.

So, similarly perhaps we could default to something like musicbrainz_rs/$version?


That said, but now in python-musicbrainz3 (which aims to be a successor to python-musicbrainz2 and also referred as python-musicbrainz-ngs), it forces the client user to set the user agent themselves.

Also from the docs here, it mentions:

Our client libraries now support functions for setting the User-Agent string. If you are using one of our libraries, you will need to add a call to set the User-Agent string via the library and your application should work again.

Is this maybe a hint that all future official libraries are now supposed to force users to set their own user agent? In this case, we should probably go this route if being an official rust client is on our roadmap.

oknozor commented 3 years ago

I guess it's worth trying to set the user agent in the integration tests and see if it does not hit rate limiting, but I am afraid it will still fails due to IP blocking/throttling rules.

Being an official rust client is definitely on our roadmap but I am not sure setting the user agent should be mandatory. I think we shall discuss this further with musicbrainz people.

For now we can simply try to fix the test suite, I will ask on the IRC if we should use a default musicbrainz_rs/$version header or force the user to set is own user agent string. I'll get back yo you when I know more.

ritiek commented 3 years ago

I will ask on the IRC if we should use a default musicbrainz_rs/$version header or force the user to set is own user agent string. I'll get back yo you when I know more.

Alright!

I guess it's worth trying to set the user agent in the integration tests and see if it does not hit rate limiting, but I am afraid it will still fails due to IP blocking/throttling rules.

I just tried this out locally and also on my fork in ritiek/musicbrainz_rs#2 (to see how GitHub actions behaves) by adding a user-agent and removing the 1s sleep from all tests. You are right about tests still failing due to IP throttling.

but implementing a retry on HTTP 503 responses might also be a good idea.

This could be a way out maybe. However, when MusicBrainz hits the rate-limit, they seem to only return the HTTP 503 error code, but nothing else useful in the response content such as how much time to wait before making the next query that wouldn't be rate-limited. We probably may have to sleep for small random durations before making a retry query if we are to implement this I think, which I'm not sure MusicBrainz people will be happy with.

oknozor commented 3 years ago

No response from musicbrainz people so far, I guess we will stick with the default header for now and let users set the user-agent if needed. Anyway I am not comfortable with making the user agent header mandatory.

As for the retry on 503 response, this is actually a good idea. I just checked, musicbrainz server send a Retry-After header so this would be straight forward to implement.

ritiek commented 3 years ago

As for the retry on 503 response, this is actually a good idea. I just checked, musicbrainz server send a Retry-After header so this would be straight forward to implement.

Wow, that's cool! I shouldn't have overlooked the response headers.