Open vlimant opened 6 years ago
Could you provide information:
If this is a load-related error, we need to find out where the bottleneck is.
You can try to create a transfer of any dataset to 15 sites and disapprove all of them at once. This will fail, I guarantee it.
I just tried to approve all pending for https://cmsweb.cern.ch/phedex/prod/Request::View?request=1097871
and got
""" Apologies, looks like we have an internal server error, details of which below. If the problem persists, please submit a bug report.
Error time=2017-09-11 10:23:56 UTC id=4525e0d67b4c2004d6ce584fddf59b78 """
several times
Hi, thanks for the timestamp, it was helpful. I don't have approval rights and would not be able to try myself.
Anyway, I tracked this down to this error in the server log:
2017-09-11 10:23:56 UTC: error: id=4525e0d67b4c2004d6ce584fddf59b78 Error evaluating client identity at /data/srv/beHG1707d/sw/slc7_amd64_gcc630/cms/PHEDEX-datasvc/2.3.24/perl_lib/PHEDEX/Web/API/UpdateRequest.pm line 116.\n at /data/srv/state/phedex/htdocs/WebSite/access25 line 4356.
Investigating further, I found this to be a sort of safety feature introduced 5 years ago: https://github.com/dmwm/PHEDEX/commit/80fee7f98154ead19c4c55a2edabe8151d04a208 which prevents (at the authentication level) the approval of more than 10 node requests at a time.
In principle this makes sense, because once deletions are approved, they cannot be undone!
Since the request in question is fully approved now, I assume you eventually succeeded after trying a few times. If this is the case (please confirm), I see several options: A) leave this feature as a precaution, but document it and produce a meaningful error; B) increase the limit to whatever seems practical to you; C) disable the feature.
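To illustrate option A, here is a minimal sketch of a limit check that reports a meaningful error instead of a generic internal server error. It is written in Python for illustration only: the real check lives in the Perl data service, and the names `check_node_limit`, `TooManyNodesError`, and `MAX_NODES_PER_APPROVAL` are hypothetical, not part of PhEDEx.

```python
# Hypothetical sketch; this is NOT the actual PhEDEx code (which is Perl).
# The limit of 10 is the value described in this thread.
MAX_NODES_PER_APPROVAL = 10


class TooManyNodesError(Exception):
    """Raised when one approval call covers more nodes than allowed."""


def check_node_limit(nodes, limit=MAX_NODES_PER_APPROVAL):
    """Return the node list unchanged, or raise an error that names the limit."""
    count = len(nodes)
    if count > limit:
        # Tell the user what went wrong and how to proceed.
        raise TooManyNodesError(
            f"Cannot approve {count} nodes in one call; the limit is {limit}. "
            "Split the request into smaller batches."
        )
    return nodes
```

With a message like this, a user approving 15 sites at once would see the limit and the suggested action, rather than an error id to look up in the server log.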
It would be great to understand why this was introduced. Experimentally, the limit is not at 10 but around 2-3. This is very impractical.
When trying to approve anything over multiple sites, the "submit" often (like 99% of the time) returns with an error. I think it is time to fix this.