csaf-sbom / kotlin-csaf

A Kotlin implementation of the CSAF standard.
Apache License 2.0

Support rate and concurrency limit #65

Open oxisto opened 3 weeks ago

oxisto commented 3 weeks ago

It seems that some providers, e.g. RedHat, apply rate limiting and also limit the number of concurrent connections.

milux commented 3 weeks ago

@tschmidtb51 We have not observed any kind of rate limiting on any provider so far. I parsed all RedHat CSAF files (currently 50448) with high concurrency, multiple times, without issues. If we are supposed to do rate limiting, how exactly should we implement it? Requests per host per second/minute/hour? And what is the expected behavior when the limit is reached? Blocking/suspending calls until quota is available again? Failing? Both, depending on a timeout?
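To make the "both, depending on a timeout" option concrete, here is a minimal sketch of a requests-per-second limiter whose `tryAcquire` either waits for the next free slot or fails when the wait would exceed the caller's timeout. The class name `SimpleRateLimiter`, its parameters, and the injectable clock are all hypothetical and not part of kotlin-csaf. It blocks a thread rather than suspending a coroutine, purely to stay dependency-free.

```kotlin
import java.util.concurrent.TimeUnit

/**
 * Hypothetical fixed-interval limiter: hands out at most [permitsPerSecond]
 * slots per second. The clock and sleep functions are injectable so the
 * behavior can be checked without real waiting.
 */
class SimpleRateLimiter(
    permitsPerSecond: Double,
    private val nanoTime: () -> Long = System::nanoTime,
    private val sleepNanos: (Long) -> Unit = { TimeUnit.NANOSECONDS.sleep(it) },
) {
    private val intervalNanos = (1_000_000_000L / permitsPerSecond).toLong()
    private var nextFreeAt = nanoTime()

    /**
     * Returns true after waiting for the next slot, or false (without
     * reserving a slot) if that wait would be longer than [timeoutNanos].
     * A timeout of 0 means "fail immediately unless a slot is free now".
     */
    fun tryAcquire(timeoutNanos: Long): Boolean {
        val waitNanos = synchronized(this) {
            val now = nanoTime()
            val slot = maxOf(nextFreeAt, now)
            if (slot - now > timeoutNanos) return false
            nextFreeAt = slot + intervalNanos // reserve the slot before sleeping
            slot - now
        }
        if (waitNanos > 0) sleepNanos(waitNanos)
        return true
    }
}
```

With this shape, a caller that wants pure blocking passes `Long.MAX_VALUE`, and a caller that wants fail-fast passes `0`, so a single code path covers both behaviors.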

tschmidtb51 commented 2 weeks ago

Interesting. It looks like they have loosened their limits.

Anyway, there are others that are stricter. So, I suggest looking into what ISDuBA and/or the csaf_downloader do. If I remember correctly, they use a limit on requests per second and a limit on simultaneous requests.

So the idea could be to:

  1. use a queue and, per source, request only as many files as the limit allows, or
  2. slow down requests according to the limit.

When the (internal, configurable) limit is reached, I guess the download should be paused until capacity is free again.
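The second option, combined with a cap on simultaneous requests, could be sketched roughly as follows. This is not how ISDuBA or the csaf_downloader implement it, and `ThrottleConfig`, `HostThrottle`, `Throttles`, and the default values are hypothetical names chosen for illustration. Each source gets a semaphore for concurrency plus a minimum spacing between request starts, and callers are simply paused until capacity is free.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.Semaphore

/** Hypothetical per-source limits; names and defaults are illustrative only. */
data class ThrottleConfig(
    val maxConcurrent: Int = 4,
    val maxRequestsPerSecond: Double = 10.0,
)

class HostThrottle(config: ThrottleConfig) {
    private val slots = Semaphore(config.maxConcurrent)
    private val intervalNanos = (1_000_000_000L / config.maxRequestsPerSecond).toLong()
    private val lock = Any()
    private var nextStartAt = System.nanoTime()

    /** Runs [request] once a concurrency slot and a rate slot are both available. */
    fun <T> throttled(request: () -> T): T {
        slots.acquire() // pause while too many requests are in flight
        try {
            // Reserve the next start time under the lock, then sleep outside it.
            val waitNanos = synchronized(lock) {
                val now = System.nanoTime()
                val start = maxOf(nextStartAt, now)
                nextStartAt = start + intervalNanos
                start - now
            }
            if (waitNanos > 0) {
                Thread.sleep(waitNanos / 1_000_000, (waitNanos % 1_000_000).toInt())
            }
            return request()
        } finally {
            slots.release()
        }
    }
}

/** One throttle per host, created lazily with a shared configuration. */
class Throttles(private val config: ThrottleConfig = ThrottleConfig()) {
    private val perHost = ConcurrentHashMap<String, HostThrottle>()

    fun forHost(host: String): HostThrottle =
        perHost.computeIfAbsent(host) { HostThrottle(config) }
}
```

A download would then be wrapped as `throttles.forHost(host).throttled { fetch(url) }`, where `fetch` stands for whatever function performs the actual request. A coroutine-based variant would replace the blocking semaphore and sleep with suspending equivalents.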

oxisto commented 1 week ago

@tschmidtb51 do you have an example of a rate-limited server? In particular, how do they implement it? We have not really observed this in the wild (yet).

tschmidtb51 commented 1 week ago

Try siemens.com and tibco.com

oxisto commented 1 week ago

Unfortunately, we cannot even access the provider metadata of www.tibco.com. It works in the browser (https://www.tibco.com/.well-known/csaf/provider-metadata.json), but accessing it via this library results in a 403. A plain curl request also returns 403, and setting a browser-like user agent does not help either.

tschmidtb51 commented 1 week ago

I remember that issue... I don't have a solution yet.