aiidateam / aiida-quantumespresso-hp


BUG: Wrong restart settings in run_scf_fixed_magnetic #11

Closed MackeEric closed 1 year ago

MackeEric commented 3 years ago

First of all, the run_scf_fixed_magnetic method currently loads the previous SCF workchain but does not pass a parent folder, so the calculation is not actually restarted from any file. To fix this, the following line needs to be added:

    inputs.pw.parent_folder = previous_workchain.outputs.remote_folder

Furthermore, some of the input parameters are wrong and would not allow for a correct restart (see the pw.x docs for the reasons). The following changes must be implemented:

    inputs.pw.parameters['CONTROL']['restart_mode'] = 'from_scratch'
    inputs.pw.parameters['ELECTRONS']['startingpot'] = 'file'

I found that setting the startingwfc parameter to 'file' often leads to problems and is not really necessary if the potential is provided.

sphuber commented 3 years ago

So which input parameters are actually incorrect? I don't see startingwfc being set anywhere?

MackeEric commented 3 years ago

So the ['restart_mode'] is currently set to restart which sounds correct but actually isn't (see pw.x input for the reason). ['startingpot'] = 'file' and ['startingwfc'] = 'file' are not yet in the code, but at least the first of the two must be added to the parameters in order to turn this calculation into an actual restart calculation.

sphuber commented 3 years ago

> So the ['restart_mode'] is currently set to restart which sounds correct but actually isn't (see pw.x input for the reason).

Do you mean the following?

> 'restart' : From previous interrupted run. Use this switch only if you want to continue, using the same number of processors and parallelization, an interrupted calculation. Do not use to start a new one, or to perform a non-scf calculations. Works only if the calculation was cleanly stopped using variable max_seconds, or by user request with an "exit file" (i.e.: create a file "prefix".EXIT, in directory "outdir"; see variables prefix, outdir). Overrides startingwfc and startingpot.

It seems to say that you can use this flag to perform a restart. If this is wrong, then there is also a problem with PwBaseWorkChain, which also uses this flag to perform a restart, and I am pretty sure that works as intended. We have run many restarts with this, and if it doesn't work we may have a big problem.
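The override rule in the quoted documentation can be sketched in plain Python. This is only an illustration, not pw.x or AiiDA code: `effective_start_settings` is a hypothetical helper, the mapping of `'restart'` to `'file'` for both settings is an assumption (the quoted text only says the two settings are overridden), and the fallback values `'atomic'` and `'atomic+random'` are assumed to be the pw.x defaults for an SCF run.

```python
def effective_start_settings(parameters):
    """Return the (startingpot, startingwfc) pair that would take effect.

    Hypothetical helper: it models only the rule quoted above, namely that
    restart_mode = 'restart' overrides startingpot and startingwfc.
    """
    control = parameters.get('CONTROL', {})
    electrons = parameters.get('ELECTRONS', {})

    if control.get('restart_mode', 'from_scratch') == 'restart':
        # Assumption: the override behaves like reading both from file.
        return 'file', 'file'

    # Assumed pw.x defaults for an SCF run when nothing is set explicitly.
    return (
        electrons.get('startingpot', 'atomic'),
        electrons.get('startingwfc', 'atomic+random'),
    )


restart = {'CONTROL': {'restart_mode': 'restart'},
           'ELECTRONS': {'startingpot': 'atomic'}}
print(effective_start_settings(restart))
```

Under these assumptions, an explicit `startingpot` has no effect once `restart_mode` is `'restart'`.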

MackeEric commented 3 years ago

Exactly, I would emphasize this:

> Works only if the calculation was cleanly stopped using variable max_seconds, or by user request with an "exit file".

Typically, our prior "smearing" SCF calculation is neither stopped by max_seconds nor does it produce any kind of exit file. Instead, the program exits normally and the wavefunctions and charge densities are written out as usual. I discovered this while doing manual tests, after seeing that many calculations which had converged with smearing often required more than 200 steps in the fixed SCF to converge (or did not converge at all). This does not make any sense: it shouldn't take pw.x more than a handful of iterations to converge again. I can set up some test inputs for you to demonstrate the problem. I don't know since when ['restart_mode'] has behaved that way, but it might be a recent change to QE.
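A quick way to check the symptom described above is to read the iteration count from the pw.x standard output of the restarted run. The sketch below is not part of the workchain: `scf_iteration_counts` is a hypothetical helper, and the matched line ("convergence has been achieved in N iterations") is an assumption about the pw.x output format.

```python
import re

# Assumed format of the pw.x convergence line; adjust if your output differs.
_CONVERGED = re.compile(r'convergence has been achieved in\s+(\d+)\s+iterations')


def scf_iteration_counts(output_text):
    """Return the iteration count of every converged SCF cycle in the text."""
    return [int(count) for count in _CONVERGED.findall(output_text)]


sample = """
     total energy              =    -123.45678901 Ry
     convergence has been achieved in  11 iterations
"""
print(scf_iteration_counts(sample))
```

A restarted run that really reads the previous charge density should report a small count here; an empty list means no converged cycle was found in the text.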

MackeEric commented 3 years ago

Okay, forget about it. I've just rechecked all these settings and it seems that inputs.pw.parameters['CONTROL']['restart_mode'] = 'restart' is indeed equivalent to setting

    inputs.pw.parameters['ELECTRONS']['startingpot'] = 'file'
    inputs.pw.parameters['ELECTRONS']['startingwfc'] = 'file'

It might have been a different problem/error that led me to believe there is a problem with the restart_mode. However, the line

    inputs.pw.parent_folder = previous_workchain.outputs.remote_folder

was missing in the workchain and must be added.
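The outcome of this thread can be sketched with plain dictionaries standing in for the AiiDA input namespace. This is a minimal illustration, not the code of the actual fix: `prepare_restart_inputs` and the placeholder folder value are hypothetical, and a real workchain would pass the `remote_folder` output node of the previous workchain instead of a string.

```python
from copy import deepcopy


def prepare_restart_inputs(inputs, previous_remote_folder):
    """Return a copy of `inputs` set up to restart from a previous run.

    Hypothetical helper operating on plain dicts, not on AiiDA nodes.
    """
    restart = deepcopy(inputs)  # leave the caller's inputs untouched
    pw = restart.setdefault('pw', {})

    # The missing piece identified in this thread: without a parent folder
    # there are no files to restart from.
    pw['parent_folder'] = previous_remote_folder

    # Keep restart_mode = 'restart', creating the namelist if it is absent.
    control = pw.setdefault('parameters', {}).setdefault('CONTROL', {})
    control['restart_mode'] = 'restart'
    return restart


inputs = {'pw': {'parameters': {'CONTROL': {'calculation': 'scf'}}}}
restart_inputs = prepare_restart_inputs(inputs, 'remote-folder-placeholder')
print(restart_inputs['pw']['parent_folder'])
print(restart_inputs['pw']['parameters']['CONTROL'])
```

Using `setdefault` avoids a `KeyError` when the `CONTROL` namelist has not been defined yet.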

bastonero commented 1 year ago

Fixed in #32