Open tiborsimko opened 3 years ago
Similarly to the Yadage job progress tracking issue https://github.com/reanahub/reana-workflow-engine-yadage/issues/204, the Serial engine seems to be doing unnecessary work when tracking job status.
RooFit Serial example:
$ kubectl logs reana-run-batch-0e439059-6141-4918-8c97-e571fee45678-wt2lh workflow-engine | wc -l
3
$ kubectl logs reana-run-batch-0e439059-6141-4918-8c97-e571fee45678-wt2lh job-controller | wc -l
390
$ kubectl logs reana-run-batch-0e439059-6141-4918-8c97-e571fee45678-wt2lh job-controller | grep werkz
2021-10-06 08:39:40,276 | werkzeug | MainThread | INFO |  * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
2021-10-06 08:39:49,420 | werkzeug | Thread-1 | INFO | 127.0.0.1 - - [06/Oct/2021 08:39:49] "GET /jobs HTTP/1.1" 200 -
2021-10-06 08:39:49,770 | werkzeug | Thread-2 | INFO | 127.0.0.1 - - [06/Oct/2021 08:39:49] "POST /jobs HTTP/1.1" 201 -
2021-10-06 08:39:49,774 | werkzeug | Thread-3 | INFO | 127.0.0.1 - - [06/Oct/2021 08:39:49] "GET /jobs/05fe18c4-924b-4506-88c1-176a2571968e HTTP/1.1" 200 -
2021-10-06 08:39:49,780 | werkzeug | Thread-4 | INFO | 127.0.0.1 - - [06/Oct/2021 08:39:49] "GET /jobs/05fe18c4-924b-4506-88c1-176a2571968e HTTP/1.1" 200 -
2021-10-06 08:39:52,788 | werkzeug | Thread-5 | INFO | 127.0.0.1 - - [06/Oct/2021 08:39:52] "GET /jobs/05fe18c4-924b-4506-88c1-176a2571968e HTTP/1.1" 200 -
2021-10-06 08:39:55,859 | werkzeug | Thread-6 | INFO | 127.0.0.1 - - [06/Oct/2021 08:39:55] "POST /jobs HTTP/1.1" 201 -
2021-10-06 08:39:55,862 | werkzeug | Thread-7 | INFO | 127.0.0.1 - - [06/Oct/2021 08:39:55] "GET /jobs/2a71e77b-23af-40fa-8f56-0fa810bd52d8 HTTP/1.1" 200 -
2021-10-06 08:39:55,865 | werkzeug | Thread-8 | INFO | 127.0.0.1 - - [06/Oct/2021 08:39:55] "GET /jobs/2a71e77b-23af-40fa-8f56-0fa810bd52d8 HTTP/1.1" 200 -
2021-10-06 08:39:58,872 | werkzeug | Thread-9 | INFO | 127.0.0.1 - - [06/Oct/2021 08:39:58] "GET /jobs/2a71e77b-23af-40fa-8f56-0fa810bd52d8 HTTP/1.1" 200 -
Each job's status is queried three times, and two of those queries happen within the same second, right after the job is submitted.
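To make the per-job count easier to reproduce on other runs, here is a small sketch that tallies the status queries from the job-controller log. The function name `count_status_queries` and the regular expression are illustrative assumptions, not part of REANA; the sample lines are shortened copies of three entries from the log output above.

```python
import re
from collections import Counter

# Assumption: status queries show up in the werkzeug access log as
# "GET /jobs/<uuid> HTTP/1.1". The bare "GET /jobs" listing call is
# deliberately not matched because it carries no job ID.
STATUS_QUERY = re.compile(r'"GET /jobs/([0-9a-f-]{36}) HTTP/1\.1"')


def count_status_queries(log_lines):
    """Return a Counter mapping job ID to the number of status queries."""
    counts = Counter()
    for line in log_lines:
        match = STATUS_QUERY.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Shortened sample taken from the log excerpt in this issue.
sample_lines = [
    '... "GET /jobs HTTP/1.1" 200 -',
    '... "POST /jobs HTTP/1.1" 201 -',
    '... "GET /jobs/05fe18c4-924b-4506-88c1-176a2571968e HTTP/1.1" 200 -',
    '... "GET /jobs/05fe18c4-924b-4506-88c1-176a2571968e HTTP/1.1" 200 -',
    '... "GET /jobs/05fe18c4-924b-4506-88c1-176a2571968e HTTP/1.1" 200 -',
]

sample_counts = count_status_queries(sample_lines)
for job_id, n in sample_counts.items():
    print(job_id, n)
```

The same function could be fed the full output of `kubectl logs ... job-controller` read from standard input.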
Looking at job execution times:
$ reana-client logs -w 0e439059-6141-4918-8c97-e571fee45678 | grep 2021-10
==> Started: 2021-10-06T08:39:49
==> Finished: 2021-10-06T08:39:55
==> Started: 2021-10-06T08:39:55
==> Finished: 2021-10-06T08:40:01
It seems that the initial "double calls", issued immediately after each job is submitted, should not be necessary.
We may want to reduce the number of queries made from the workflow engine container to the job controller container.
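One possible shape for such a reduction is a polling loop that waits one interval before the first status query and then issues exactly one query per interval. This is only a sketch under stated assumptions, not the Serial engine's actual code: `wait_for_job`, the `get_status` callable, the terminal state names and the 3-second interval (guessed from the spacing of the later queries in the log) are all hypothetical.

```python
import time


def wait_for_job(get_status, poll_interval=3.0, sleep=time.sleep,
                 terminal_states=("finished", "failed", "stopped")):
    """Poll a job until it reaches a terminal state.

    get_status is a hypothetical callable that performs one status
    query against the job controller and returns a status string.
    Returns the final status and the number of queries issued.
    """
    queries = 0
    while True:
        # Sleep first: a job that was just submitted is unlikely to be
        # done yet, so an immediate query (or two) gains nothing.
        sleep(poll_interval)
        status = get_status()
        queries += 1
        if status in terminal_states:
            return status, queries


# Usage with a fake status source and a fake sleep, so the example
# runs instantly and without a job controller.
fake_statuses = iter(["running", "finished"])
recorded_sleeps = []
final_status, query_count = wait_for_job(
    get_status=lambda: next(fake_statuses),
    sleep=recorded_sleeps.append,
)
print(final_status, query_count)
```

With this shape, a job that takes about six seconds would be queried twice rather than three times. The trade-off is that very short jobs are reported up to one interval late.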