dmwm / DBS

CMS Dataset Bookkeeping Service
Apache License 2.0

DBS access frequency control #600

Open yuyiguo opened 5 years ago

yuyiguo commented 5 years ago

@vkuznet @belforte @amaltaro @bbockelm The DBS servers crashed frequently for a period of time, as described in https://github.com/dmwm/DBS/issues/595. The root cause was a client that opened hundreds or even thousands of threads against DBS. This risky behavior affected all DBS clients. To prevent this from happening again, we need to limit how many calls a client can make per minute. Should the limit be based on DN? Should we distinguish between individual users and production systems? Where should the limit be enforced, in the front end or the back end?

vkuznet commented 5 years ago

I posted my response in another ticket, but for completeness it should go here:

For Apache we can use mod_evasive or mod_throttle. For the DBS backend I didn't find an explicit CherryPy solution, so we should probably write our own. For Flask we can use http://flask.pocoo.org/snippets/70/

If we need to write our own throttling for DBS (or WMCore in general), it should track clients by DN (so that bulk requests from distributed clients can be caught), and we can whitelist production system DNs.
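
A minimal sketch of DN-based throttling with a whitelist, assuming a sliding one-minute window per DN. The class name `DNRateLimiter`, its parameters, and the example DNs are hypothetical illustrations, not code from WMCore or DBS:

```python
import threading
import time
from collections import defaultdict, deque


class DNRateLimiter:
    """Hypothetical per-DN sliding-window rate limiter (illustration only)."""

    def __init__(self, max_calls, window_seconds=60.0, whitelist=()):
        self.max_calls = max_calls
        self.window = window_seconds
        self.whitelist = frozenset(whitelist)  # e.g. production system DNs
        self.calls = defaultdict(deque)        # DN -> timestamps of recent calls
        self.lock = threading.Lock()           # server threads share this state

    def allow(self, dn, now=None):
        """Return True if a call from `dn` is within the limit, else False."""
        if dn in self.whitelist:
            return True
        now = time.monotonic() if now is None else now
        with self.lock:
            recent = self.calls[dn]
            # Drop timestamps that have fallen out of the window.
            while recent and now - recent[0] >= self.window:
                recent.popleft()
            if len(recent) >= self.max_calls:
                return False
            recent.append(now)
            return True


# Usage: at most 2 calls per minute per DN; the production DN is exempt.
limiter = DNRateLimiter(max_calls=2, whitelist={"/CN=production-system"})
for attempt in range(3):
    print(attempt, limiter.allow("/CN=some-user"))
```

Because the counter is keyed by DN rather than by host, requests from the same identity are counted together even when they arrive from many distributed clients.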

yuyiguo commented 5 years ago

@vkuznet If we can do this in the front end, it would be better. Once all the requests reach the back ends, it could be a bit late to stop them. Yuyi

bbockelm commented 5 years ago

You probably don't really want a rate limiter but rather a concurrency limiter -- no one cares if you do many cheap queries.

We do this in CRAB to prevent users from hitting the service with many queries in parallel:

https://github.com/dmwm/CRABServer/commit/7601be8960452373784000d4e29e368ff4774cbc

It happens on the backend since the resources there are more precious...
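
A minimal sketch of the difference, assuming a per-user counter of requests currently in flight. This is not the CRAB code from the commit above: `ConcurrencyLimiter`, `TooManyConcurrentRequests`, and the `get_user` callback are hypothetical names, and how the user identity is obtained from the request is left as an assumption:

```python
import threading
from functools import wraps


class TooManyConcurrentRequests(Exception):
    """Raised when a user already has `limit` requests in flight."""


class ConcurrencyLimiter:
    """Hypothetical per-user concurrency limiter (illustration only)."""

    def __init__(self, limit=3):
        self.limit = limit
        self.lock = threading.Lock()
        self.in_flight = {}  # user -> number of requests currently running

    def throttled(self, get_user):
        """Decorator factory; `get_user()` returns the caller's identity (e.g. DN)."""
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                user = get_user()
                with self.lock:
                    count = self.in_flight.get(user, 0)
                    if count >= self.limit:
                        raise TooManyConcurrentRequests(user)
                    self.in_flight[user] = count + 1
                try:
                    return func(*args, **kwargs)
                finally:
                    # Always release the slot, even if the call failed.
                    with self.lock:
                        self.in_flight[user] -= 1
                        if self.in_flight[user] == 0:
                            del self.in_flight[user]
            return wrapper
        return decorator


# Usage: sequential cheap queries are never rejected, however many are made.
limiter = ConcurrencyLimiter(limit=3)


@limiter.throttled(get_user=lambda: "/CN=some-user")
def cheap_query():
    return "ok"


for _ in range(1000):
    cheap_query()
```

Only calls that overlap in time count against the limit, so a client is rejected when it has too many requests running at once, not when it makes many quick requests in a row.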

belforte commented 5 years ago

Amazing how much fancier CRAB is than I ever realized (or suspected)! It sounds like nobody ever gets throttled in there, but it is surely good to know that it does no harm :-)

vkuznet commented 5 years ago

This looks perfect. @amaltaro, Alan, can we put this code into WMCore? The only downside is that it appeared in CRAB first rather than in WMCore. If we put it into WMCore, we can patch CRAB to load it from WMCore instead of CRABInterface. @belforte, Stefano, do you agree with this, i.e. putting this code into WMCore and modifying CRABInterface to load it from there?

If we agree to do it, we can ask Yuyi to decorate the DBS APIs and test it.

belforte commented 5 years ago

Yes, I agree to move this from CRAB to the WMCore repo.

vkuznet commented 5 years ago

OK, I ported the code from CRABServer into WMCore, see https://github.com/dmwm/WMCore/pull/9158. Once it is merged I'll make a PR for CRABServer.

@yuyiguo, Yuyi, you can look at the unit test I provided (https://github.com/dmwm/WMCore/blob/7b2ec2a690d03e8066dcc56d182070a0c451c09b/test/python/Utils_t/Throttled_t.py) and implement something similar for DBS. Basically, what you need to do is add an additional decorator to each API method you'd like to throttle, e.g.

@global_user_throttle.make_throttled()
def throttled_function():
    "Test function for throttled"
    # put here any logic you want

vkuznet commented 5 years ago

@belforte, Stefano, here is the relevant PR for CRABServer: https://github.com/dmwm/CRABServer/pull/5880