nipreps / mriqc

Automated Quality Control and visual reports for Quality Assessment of structural (T1w, T2w) and functional MRI of the brain
http://mriqc.readthedocs.io
Apache License 2.0

possible error in "Elapsed time" in log files #861

Closed jbwexler closed 2 years ago

jbwexler commented 3 years ago

I ran OpenNeuro dataset ds000172 with mriqc-0.15.1 on 2/15/20 on TACC. There are thirteen subjects, and the longest-running subject took 4379 s according to the "Elapsed time" at the end of its log file. However, according to TACC's records, this job ran for 14129 s. They sent me a few screenshots as evidence, which I've attached (the x-axis of the graph is in hours). Generally, when I compare the run times given by the log files with TACC's records, they are pretty similar, so this is unusual. One thing to note from the graph screenshot is that the bulk of the computation happened in the last hour of the run.

[Attached screenshots: Screen Shot 2020-10-01 at 6 00 21 PM, Screen Shot 2020-10-01 at 6 06 16 PM]

oesteban commented 2 years ago

Probably both pieces of information are true, although they measure different things. The "Elapsed time" reported in the log files is basically the difference between the timestamp when the workflow execution logic finishes and the timestamp when it starts.
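To illustrate how two timers can disagree, here is a minimal sketch. It is not MRIQC's or nipype's actual code: `run_with_timers`, `setup`, and `execute` are hypothetical names, and the sleeps merely stand in for a slow staging phase and a shorter execution phase.

```python
import time


def run_with_timers(setup, execute):
    """Hypothetical driver: time the execution step alone and the whole run."""
    wall_start = time.monotonic()   # roughly what an external scheduler would see
    setup()                         # stand-in for workflow staging/setup
    exec_start = time.monotonic()   # a log-style timer would start only here
    execute()                       # stand-in for the workflow execution logic
    exec_end = time.monotonic()
    return {
        "logged_elapsed": exec_end - exec_start,
        "wall_elapsed": exec_end - wall_start,
    }


# Assumed durations, chosen only to make the gap visible.
timings = run_with_timers(lambda: time.sleep(0.2), lambda: time.sleep(0.05))
print(f"logged: {timings['logged_elapsed']:.2f}s, wall: {timings['wall_elapsed']:.2f}s")
```

Any time spent before the inner timer starts shows up only in the outer measurement, which would be consistent with a log reporting far less than the scheduler does.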

TACC probably retrieved this information from SLURM.
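If the scheduler figure does come from SLURM accounting, it would typically be printed as `[DD-[HH:]]MM:SS` rather than in seconds. Below is a small sketch of converting such a string so it can be compared with the log's figure. The helper `slurm_elapsed_to_seconds` is a hypothetical name, and the string `"03:55:29"` is simply 14129 s rewritten in that format, not actual output from this job.

```python
def slurm_elapsed_to_seconds(elapsed: str) -> int:
    """Convert a SLURM-style elapsed string ([DD-[HH:]]MM:SS) to seconds."""
    days = 0
    if "-" in elapsed:
        day_part, elapsed = elapsed.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in elapsed.split(":")]
    # Pad missing leading fields (e.g. "MM:SS") with zeros.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


scheduler_seconds = slurm_elapsed_to_seconds("03:55:29")  # 14129 s, as reported by TACC
logged_seconds = 4379                                     # "Elapsed time" from the log file
print(scheduler_seconds - logged_seconds)                 # time not covered by the log timer
```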

So there seems to be some issue with nipype's staging of processes, or in MRIQC's workflow setup, that stalls the run for a very long while. Many people complain that the workflow gets stalled right after it starts. I wonder if those users just didn't have the patience to wait for as long as you did (e.g., #825).

Definitely a worrisome bug to chase after.

oesteban commented 2 years ago

It seems the "Elapsed time" reporting was dropped from 0.16 on. Since the elapsed time is better estimated by external tools, this issue can be closed.