Open GVigne opened 2 years ago
Yeah I'm not too sure what to do. Maybe just time the first thread? Or print a warning that timer outputs are unreliable when running threaded? Or raise it at TimerOutputs?
Well, we could always remove the check for the threadid, and we would be merging all timers into the main one. But I think @mfherbst said there were a few threading updates in Julia 1.8, and that multi-threading in DFTK would probably have to be updated to reflect that.
Hi, I ran into some strange behavior when looking at the timer for multi-threaded computations. More specifically, the number of applications of the local and kinetic terms is off. For example, for a single-threaded run, here is the timer I get:
local+kinetic | 2.35k | 83.2s | 71.9% | 35.5ms | 1.79MiB | 0.0% | 800B
However, running the same code with 8 threads yields:
local+kinetic | 219 | 19.8s | 31.2% | 90.4ms | 3.24MiB | 0.0% | 15.1KiB
The other values in the timer are more or less the same, so it really happens just for this term. This is because when applying the Hamiltonian, we build a timer for every thread, but only merge the one on the master thread with the DFTK timer. Essentially, when running multi-threaded code, we discard all the temporary timers except one.
This raises a broader question, which we discussed a bit with @mfherbst: should the timer be a debugging tool, or one meant for the end user to see what happened during the computations and why? If it is meant for the end user, we probably shouldn't keep the timer this way, since looking at it gives the impression that running the code with more than one thread reduces the number of computations done.
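To make the mechanism concrete, here is a minimal sketch of the alternative mentioned in the discussion, where every temporary timer is merged into the main one instead of only the master thread's. It is not DFTK's actual code: `apply_term`, `timed_apply`, `locals` and `main` are hypothetical names made up for illustration, and it assumes the TimerOutputs package is installed. Each loop iteration gets its own timer, and the merge happens serially after the threaded loop so the main timer is never modified from two threads at once.

```julia
using TimerOutputs   # third-party package, assumed to be installed

# Hypothetical stand-in for applying one Hamiltonian term to one band.
apply_term(x) = sum(abs2, x)

function timed_apply(n_bands)
    main   = TimerOutput()                        # plays the role of the global timer
    locals = [TimerOutput() for _ in 1:n_bands]   # one private timer per iteration
    data   = [rand(100) for _ in 1:n_bands]

    Threads.@threads for iband in 1:n_bands
        to = locals[iband]                        # no two iterations share a timer
        @timeit to "local+kinetic" apply_term(data[iband])
    end

    # Merge serially once the threaded loop is done, keeping every
    # temporary timer rather than only the one from the master thread.
    for to in locals
        merge!(main, to)
    end
    return main
end

main = timed_apply(16)
println(TimerOutputs.ncalls(main["local+kinetic"]))
```

With this approach the call count no longer depends on the number of threads. The merged time is the sum over all temporary timers though, so it is an aggregate across threads and not wall-clock time, which would probably need to be stated wherever the timer is shown to end users.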