cms-PdmV / cmsPdmV

CERN CMS McM repository

cleanup list of "force-complete" #1140

Closed vlimant closed 1 week ago

vlimant commented 2 months ago

It looks like the list https://cms-pdmv-prod.web.cern.ch/mcm/restapi/requests/forcecomplete is not updated once a request has been acted upon and goes into "done" status. This issue is to review the mechanism behind that list and, if needed, how to implement a cleanup so that the list does not grow indefinitely.

The desired behaviour is that an entry in https://cms-pdmv-prod.web.cern.ch/mcm/restapi/requests/forcecomplete is set by operating in McM and then gets picked up by unified/ops (https://gitlab.cern.ch/CMSProductionReprocessing/WmAgentScripts/-/blob/master/Unified/completor.py?ref_type=heads#L87).

lmoureaux commented 2 months ago

Requests are removed from the forcecomplete list by set_status() once they reach status new or done, here:

https://github.com/cms-PdmV/cmsPdmV/blob/9d10daa2dc3d2801d7a6041147e24bc3b3fbc104/mcm/json_layer/request.py#L173-L174

set_status is called from several places, including inspect (inspect -> inspect_submitted -> set_status), once a submitted request reaches announced or normal-archived. inspect is normally called automatically twice a day by the McM Request Inspect and Flow Jenkins job.
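As a rough illustration of the removal described above (not the actual code at the linked lines), here is a minimal sketch. It assumes the forcecomplete list is a plain list of request identifiers and that a request is a dict with hypothetical `prepid` and `status` keys; the function signature is also an assumption made for the example.

```python
# Hypothetical sketch only; the real logic lives in mcm/json_layer/request.py.
# Statuses at which a request is dropped from the forcecomplete list.
CLEANUP_STATUSES = {"new", "done"}


def set_status(request, new_status, forcecomplete):
    """Update the request status and, if it is now new or done,
    remove the request from the forcecomplete list (in place)."""
    request["status"] = new_status
    if new_status in CLEANUP_STATUSES and request["prepid"] in forcecomplete:
        forcecomplete.remove(request["prepid"])
    return forcecomplete


# Example with made-up identifiers.
fc = ["REQ-A", "REQ-B"]
set_status({"prepid": "REQ-A", "status": "submitted"}, "done", fc)
```

In this sketch the cleanup only happens when set_status is actually called, which is why a failed inspect run would leave stale entries in the list.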

However, in the latest run there seem to have been quite a few authentication errors. @ggonzr is this line supposed to work with the new SSO?

https://github.com/cms-PdmV/cmsPdmV/blob/9d10daa2dc3d2801d7a6041147e24bc3b3fbc104/mcm/automatic_scripts/flow_all.py#L10-L11

ggonzr commented 2 months ago

Yes, it works with the latest SSO. The cookie is requested by the auth-get-sso-cookie package and its path is passed to the downstream process. However, it seems that executions #829 and #827 had issues using the requested cookie. Other runs like #828 and #830 finished properly. I will check this and see if I can reproduce the issue.

lmoureaux commented 2 months ago

Thanks Geovanny!

ggonzr commented 1 month ago

This rare issue seems to be caused by a problem in the script's mechanism for re-validating the session. The following PR solves it: #20

vlimant commented 2 weeks ago

one thing that can be done for now is to remove from https://cms-pdmv-prod.web.cern.ch/mcm/restapi/requests/forcecomplete anything that is already in "done" status. is there an API to do this one by one? otherwise we need a direct edit in the DB.

then we take care of the remaining ones (which will likely appear in the "dead" limbo)
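The "remove anything already in done" step could be prepared offline along these lines. This is only a sketch: it assumes the forcecomplete list is a list of request identifiers and that the status of each one can be looked up somehow; the `status_of` callable and the identifiers are hypothetical, and no McM API call is made here.

```python
def split_forcecomplete(prepids, status_of):
    """Split the list into entries to keep and entries already in "done".

    `status_of` is a hypothetical callable returning the status string
    for a given identifier (for example backed by a prior status query).
    """
    keep, drop = [], []
    for prepid in prepids:
        if status_of(prepid) == "done":
            drop.append(prepid)
        else:
            keep.append(prepid)
    return keep, drop


# Example with made-up identifiers and statuses.
statuses = {"REQ-A": "done", "REQ-B": "submitted", "REQ-C": "done"}
keep, drop = split_forcecomplete(["REQ-A", "REQ-B", "REQ-C"], statuses.get)
```

The `keep` list is what would remain in forcecomplete after the cleanup; the entries in `drop` are the ones that could be removed right away.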

lmoureaux commented 2 weeks ago

lists/update lets administrators reset the whole list (similar to a manual db update).

vlimant commented 2 weeks ago

what's the format for the PUT in this case? updating that list is a thing tech support can also do.

lmoureaux commented 2 weeks ago

> what's the format for the PUT in this case?

The same format as in the database (I don't know offhand what it looks like).

> updating that list is a thing tech support can also do.

Only @ggonzr knows the database password.

vlimant commented 1 week ago

hit #1154 on the way