Open oxisto opened 3 weeks ago
@tschmidtb51 We did not observe any kind of rate limiting on any provider yet. I parsed all RedHat CSAF files (50448 currently) with high concurrency, multiple times. No issues. If we are supposed to do rate limiting, how exactly shall we implement that? Requests per host per second/minute/hour? And what would be the expected behavior when reaching the limit? Blocking/Suspending calls until we have quota left? Failing? Both, depending on timeout?
Interesting. It looks like they loosened their restrictions.
Anyway, there are others that are stricter. So I suggest looking into what ISDuBA and/or the csaf_downloader do. If I remember correctly, they use a limit on requests per second and a limit on simultaneous requests.
So the idea could be to:
- use a queue and, per source, request only as many files as the limit allows, or
- slow down requests according to the limit.
When the limit is reached (an internal one, defined through config values), I guess the download should be paused until new capacity is free.
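To make the idea above concrete, here is a minimal sketch of a per-host limiter that combines a requests-per-second limit with a cap on simultaneous requests, and pauses callers until capacity is free. It is an illustration only, not how ISDuBA or the csaf_downloader actually implement it. The names `HostLimiter`, `run_limited`, `requests_per_second`, and `max_concurrent` are hypothetical, and the optional `timeout` is an assumption covering the "block or fail, depending on timeout" question.

```python
import threading
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class HostLimiter:
    """Hypothetical limiter for one host: spaces out request starts and
    caps the number of requests in flight at the same time."""

    def __init__(self, requests_per_second: float, max_concurrent: int):
        self._interval = 1.0 / requests_per_second  # minimum gap between starts
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self._next_start = time.monotonic()

    def acquire(self, timeout: Optional[float] = None) -> bool:
        """Block until a request may start.

        Returns False instead of waiting if capacity would not become
        available within `timeout` seconds (None means wait indefinitely).
        """
        deadline = None if timeout is None else time.monotonic() + timeout
        # Step 1: wait for a free concurrency slot.
        if not self._slots.acquire(timeout=timeout):
            return False
        # Step 2: reserve the next allowed start time for this caller.
        with self._lock:
            start = max(time.monotonic(), self._next_start)
            if deadline is not None and start > deadline:
                self._slots.release()  # give the slot back, we are failing
                return False
            self._next_start = start + self._interval
        # Step 3: pause (outside the lock) until the reserved start time.
        time.sleep(max(0.0, start - time.monotonic()))
        return True

    def release(self) -> None:
        """Mark a request as finished, freeing its concurrency slot."""
        self._slots.release()


def run_limited(limiter: HostLimiter, task: Callable[[], T],
                timeout: Optional[float] = None) -> T:
    """Run `task()` under the limiter; raise TimeoutError if no capacity in time."""
    if not limiter.acquire(timeout):
        raise TimeoutError("no request capacity became free within the timeout")
    try:
        return task()
    finally:
        limiter.release()
```

With something like this, each download would be wrapped as `run_limited(limiter, lambda: fetch(url))`, where `fetch` stands for whatever the library already uses to retrieve a document. Both limits would come from config values, and one limiter instance would be kept per host so that a strict provider does not slow down the others.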
@tschmidtb51 do you have an example of a rate limited server? Especially on how they implement this? We did not really observe this in the wild (yet).
Try siemens.com and tibco.com
Unfortunately, we cannot even access the provider metadata of www.tibco.com. It works in the browser (https://www.tibco.com/.well-known/csaf/provider-metadata.json), but accessing it via this library results in a 403. A simple curl also results in a 403. Setting a browser-like user agent does not help either...
I remember that issue... I don't have a solution yet....
It seems that some providers, e.g. RedHat, do some rate limiting and also limit concurrent connections.