Open belforte opened 2 years ago
related to https://github.com/dmwm/CRABServer/issues/6097 ?
I suspect that this piece of code is not good. Both mark_transferred and mark_failed support lists as arguments, as does the call to REST, but here they are called once for every job.
These contribute to POST for the filetransfers API (that's the only REST API used by FTS_Transfers.py):
https://github.com/dmwm/CRABServer/blob/b642c5c65ba0c5596f4303edca46118627ed6e16/scripts/task_process/FTS_Transfers.py#L531-L548
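To make the "one call per list instead of one call per job" idea concrete, here is a minimal sketch. It does not reproduce the actual FTS_Transfers.py code: `report_in_bulk`, `mark_fn` and `max_per_call` are hypothetical names, and the cap on the number of IDs per request is an assumption (the real limit, if any, would depend on what the REST endpoint accepts).

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def report_in_bulk(transfer_ids, mark_fn, max_per_call=500):
    """Call `mark_fn` once per chunk of IDs rather than once per ID.

    `mark_fn` stands in for any function that accepts a list of IDs
    (hypothetical here). Returns the number of calls made.
    """
    calls = 0
    for chunk in chunked(transfer_ids, max_per_call):
        mark_fn(chunk)
        calls += 1
    return calls


# Usage with a stand-in that only records what it receives.
received = []
ids = [f"id-{n}" for n in range(1200)]
n_calls = report_in_bulk(ids, received.append, max_per_call=500)
print(n_calls, [len(chunk) for chunk in received])
```

With 1200 IDs and a cap of 500, this issues three calls where a per-job loop would issue 1200.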
The fileusertransfer and filemetadata APIs, which are the other large source of calls and come only from the schedd (see above comments), are used by PostJob.
fileusertransfer from PostJob!
The GET to filemetadata is from Publisher/PublisherMaster, but it is not much and in any case it is only one call per task. In a way that is even too much bulk, as the output JSON can be hundreds of MB!
Looking at calls to fileusertransfers (of course calls from PostJobs are always for one job at a time, so they can't be made in bulk).
I think that PUT is one per file when a job completes and POST is when the transfer status is updated. The GETs are less clear; they may be a relic of the CouchDB code, where a document had to be read, modified, and written back.
We should be able to reduce the number of GETs: it is quite odd that we have more GET than POST! Then there are about 5x more PUT than POST, which is odd too!
GET calls to fileusertransfers are here:
https://github.com/dmwm/CRABServer/blob/82e032dcebd3f68b4f971d8349dd2b722b92e3cb/src/python/TaskWorker/Actions/PostJob.py#L785-L786
https://github.com/dmwm/CRABServer/blob/82e032dcebd3f68b4f971d8349dd2b722b92e3cb/src/python/TaskWorker/Actions/PostJob.py#L994-L999
POST calls are here:
https://github.com/dmwm/CRABServer/blob/82e032dcebd3f68b4f971d8349dd2b722b92e3cb/src/python/TaskWorker/Actions/PostJob.py#L868
https://github.com/dmwm/CRABServer/blob/82e032dcebd3f68b4f971d8349dd2b722b92e3cb/src/python/TaskWorker/Actions/PostJob.py#L1128
https://github.com/dmwm/CRABServer/blob/82e032dcebd3f68b4f971d8349dd2b722b92e3cb/src/python/TaskWorker/Actions/PostJob.py#L1162
And there is a single place where PUT is called:
https://github.com/dmwm/CRABServer/blob/82e032dcebd3f68b4f971d8349dd2b722b92e3cb/src/python/TaskWorker/Actions/PostJob.py#L828
@belforte During a private meeting you mentioned that we also have a line in publisher where we may want to do this. Could you add it here? thanks! :)
the need to change Publisher is described in https://github.com/dmwm/CRABServer/issues/6097
The following screenshot illustrates the number of HTTP queries to crabserver during peak times. While there are spikes from Publisher as well, the bulk is from schedd scripts (PostJobs and/or FTS_transfers). That's too much to feel comfortable. We can't reduce the number of SQL transactions unless we somehow redesign, but we should at least make sure that bulk APIs are used as much as possible.
URL for that dashboard: https://monit-grafana.cern.ch/d/qUVV6S0Gk/crab-timber-pods?orgId=11&from=now-3d&to=now&var-system=crabserver&var-method=All&var-api=All&var-code=All&var-metadataType=All&var-Filters=data.code%7C%3E%7C1&var-Filters=data.api%7C!%3D%7Cinfo&var-copy_of_system=All&var-cluster=cmsweb&var-env=k8s-prod&var-client=CRABSchedd%2Fv3.220107&var-client=CRABPublisher%2Fv3.211111&var-bin=10m&var-dnFilter=.*
filtering a bit I find that:
- fileusertransfer is used only by schedd scripts; it is mostly GET, with far more PUT than POST
- filetransfer is almost only POST and is used by schedd/publisher as below
- filemetadata is almost only PUT and almost only from schedd
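A per-API breakdown like the one above can be cross-checked outside the dashboard by tallying request records by API and method. The sketch below assumes each record is a dict with `api` and `method` keys; that shape is hypothetical, and the sample records are made up purely to exercise the functions. They are not measurements from crabserver.

```python
from collections import Counter


def tally_calls(records):
    """Count requests per (api, method) pair."""
    return Counter((rec["api"], rec["method"]) for rec in records)


def method_ratio(tally, api, numerator, denominator):
    """Return count(numerator) / count(denominator) for one API.

    Returns None when the denominator method was never seen.
    """
    denom = tally[(api, denominator)]
    if denom == 0:
        return None
    return tally[(api, numerator)] / denom


# Invented records, only to show the call pattern (not real data).
sample = [
    {"api": "fileusertransfers", "method": "GET"},
    {"api": "fileusertransfers", "method": "PUT"},
    {"api": "fileusertransfers", "method": "PUT"},
    {"api": "fileusertransfers", "method": "POST"},
    {"api": "filemetadata", "method": "PUT"},
]
tally = tally_calls(sample)
print(method_ratio(tally, "fileusertransfers", "PUT", "POST"))
```

Running the same tally before and after a change to the schedd scripts would show whether the GET and PUT counts actually drop relative to POST.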