Closed: scarrazza closed this issue 2 months ago
I've already seen some overhead, indeed. I have to profile the process a little bit.
Could it be that submitting a job to Slurm takes some time to start?
Not really; for this simulation job it takes ~5-10s to generate results.
We can split the pipeline into the following steps:
Among these, we can tune the time delays at steps 2, 3 and 8.
Let me try to profile everything, so we can better understand what can be optimized.
I've checked and it seems that we are wasting ~25s waiting for the daemon to discover that the job has finished.
The files that we are checking have already been created, but this conditional only turns True after some time (I don't know why).
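For context, the check in question is presumably a polling loop along these lines (a minimal sketch; the function name, paths and interval are hypothetical, not the daemon's actual code):

```python
import os
import time


def wait_for_results(paths, interval=2.0, timeout=60.0):
    """Poll until every expected result file exists.

    Worst case this adds up to `interval` seconds of latency after the
    job has actually finished, plus any filesystem caching delay.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if all(os.path.exists(p) for p in paths):
            return True
        time.sleep(interval)
    return False
```

With a scheme like this, the polling interval is a hard lower bound on the detection latency, which is why event-driven alternatives are attractive.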
I propose creating a socket in the daemon that listens on a local port and receives a signal from launch_script.sh
when the job ends. I can follow this guide to do that.
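A minimal sketch of the socket idea, assuming a hypothetical port number and message format (not the daemon's real API):

```python
import socket


def wait_for_job_end(port=8765, timeout=120.0):
    """Block until launch_script.sh connects on localhost and reports completion."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("127.0.0.1", port))
        srv.listen(1)
        srv.settimeout(timeout)  # raises socket.timeout if the job never reports
        conn, _ = srv.accept()
        with conn:
            # whatever the script sends, e.g. b"done" or the job id
            return conn.recv(1024).decode()
```

On the job side, the last line of launch_script.sh could then be something like `echo done | nc 127.0.0.1 8765` (assuming `nc` is available on the node and the daemon runs on the same host).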
Alternatively, one could investigate the Python signal library, but that seems like a less secure solution.
Let me know what you suggest to try.
I would focus on the conditional. There are other tools, like inotify, which are fast.
Have you already tried one of these two?
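An inotify-based check could be sketched with the third-party `watchdog` package (`pip install watchdog`), which wraps inotify on Linux; the directory, filename and function names below are hypothetical, and real code should also handle the race where the file already exists before the watch starts:

```python
import threading

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer


class ResultsCreated(FileSystemEventHandler):
    """Sets an event as soon as the expected result file is created."""

    def __init__(self, filename):
        self.filename = filename
        self.created = threading.Event()

    def on_created(self, event):
        if event.src_path.endswith(self.filename):
            self.created.set()


def wait_for_file(directory, filename, timeout=60.0):
    """Block until `filename` appears in `directory`, without polling."""
    handler = ResultsCreated(filename)
    observer = Observer()
    observer.schedule(handler, directory)
    observer.start()
    try:
        return handler.created.wait(timeout)
    finally:
        observer.stop()
        observer.join()
```

This reacts within milliseconds of the file creation instead of waiting out a polling interval, though inotify only works for filesystems local to the watching host (NFS mounts may not deliver events).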
Sure, checkout this for qibocal reports: https://github.com/qiboteam/qibocal/blob/main/serverscripts/qibocal-update-on-change.py
Even if we set the results-check interval to 2s, the full process still takes 32s:
This delay should be reduced to the minimum.