aiidateam / aiida-quantumespresso

The official AiiDA plugin for Quantum ESPRESSO
https://aiida-quantumespresso.readthedocs.io

Too many k-points in PDOS calculation #893

Open giovannipizzi opened 1 year ago

giovannipizzi commented 1 year ago

When I run a PDOS calculation of bulk platinum (FCC) with the QE app, using the precise protocol and overriding the pseudopotentials to SSSP Precision 1.2, the PDOS calculation fails with the error "Too many k-points". I guess this is a limitation of QE. However, I think that in this case we should implement error handlers that split the calculation into chunks and then merge the results back together (or, better, detect beforehand that this will happen and run it in chunks from the start).

sphuber commented 1 year ago

Thanks for the report @giovannipizzi. Could you please provide the inputs passed to the calculation so we can reproduce it, and/or the output file containing the error?

mbercx commented 1 year ago

However, I think that in this case we should implement error handlers that split the calculation into chunks and then merge the results back together (or, better, detect beforehand that this will happen and run it in chunks from the start).

Can we even do this? That is, can we tell Quantum ESPRESSO to run, say, half of the k-points in one run and the other half in a second run, and then merge the output files in such a way that everything still works in the end? I know this is possible for q-points, but I'm not sure about k-points.

EDIT: Of course I'm talking about the NSCF here. Doing this for the SCF (where the error can also occur) would lead to completely incorrect results.
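
For illustration, here is a minimal sketch of what the splitting step alone would look like on the AiiDA side, assuming the NSCF k-points are available as an explicit list (a mesh would first have to be expanded). The helper name `split_kpoints` is made up, and the genuinely hard part, running each chunk and merging the resulting DOS/projections, is not shown and is exactly the open question above:

```python
from aiida import orm


def split_kpoints(kpoints: orm.KpointsData, max_per_chunk: int):
    """Split an explicit k-point list into chunks (hypothetical helper, sketch only).

    This only covers the trivial half of the problem: each chunk would still have to
    be run as a separate NSCF + projwfc.x calculation and the outputs merged afterwards.
    """
    coords = kpoints.get_kpoints()  # raises if only a mesh was stored on the node
    weights = (
        kpoints.get_array('weights')
        if 'weights' in kpoints.get_arraynames() else None
    )

    chunks = []
    for start in range(0, len(coords), max_per_chunk):
        chunk = orm.KpointsData()
        chunk.set_cell(kpoints.cell)
        if weights is None:
            chunk.set_kpoints(coords[start:start + max_per_chunk])
        else:
            chunk.set_kpoints(
                coords[start:start + max_per_chunk],
                weights=weights[start:start + max_per_chunk],
            )
        chunks.append(chunk)

    return chunks
```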

Besides that, I'm also not sure we want to do this. Typically an error handler simply changes some inputs and reruns the calculation. Here we would need a work chain that splits up the k-points, runs a bunch of additional restart work chains, and then merges everything together nicely. From my experience doing this for q-points, it can be anything but trivial to implement and would add a lot of code to maintain, all because QE has a hard-coded limit on the number of k-points, a setting that can only be changed by recompiling the code.

So I would vote against trying to handle "Too many k-points" errors with an elaborate parallelised work chain. What we can do is catch the error, add a dedicated exit code for it, and give clear instructions on how to increase the maximum number of k-points when compiling Quantum ESPRESSO.
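
To make that concrete, here is a minimal sketch of what the handler side could look like, written as a subclass of `PwBaseWorkChain` only for brevity. It assumes a new `ERROR_TOO_MANY_KPOINTS` exit code is added to `PwCalculation` and returned by the parser when it finds the corresponding message in the pw.x output; that exit code does not exist today, and the number 470 below is arbitrary:

```python
from aiida.engine import ProcessHandlerReport, process_handler
from aiida_quantumespresso.calculations.pw import PwCalculation
from aiida_quantumespresso.workflows.pw.base import PwBaseWorkChain


class SketchPwBaseWorkChain(PwBaseWorkChain):
    """Sketch only: assumes `PwCalculation` gains an `ERROR_TOO_MANY_KPOINTS` exit code."""

    @classmethod
    def define(cls, spec):
        super().define(spec)
        # Work-chain-level exit code so the failure is reported clearly to the caller.
        spec.exit_code(
            470, 'ERROR_TOO_MANY_KPOINTS',
            message='pw.x stopped because the number of k-points exceeds the compiled-in `npk` '
                    'limit; reduce the k-point density or recompile QE with a larger `npk`.',
        )

    @process_handler(priority=585, exit_codes=[PwCalculation.exit_codes.ERROR_TOO_MANY_KPOINTS])
    def handle_too_many_kpoints(self, calculation):
        """Do not restart: changing inputs cannot fix this, only recompiling pw.x can."""
        return ProcessHandlerReport(True, self.exit_codes.ERROR_TOO_MANY_KPOINTS)
```

In practice this would of course go into `PwBaseWorkChain` itself, next to the existing handlers, rather than a subclass.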

mbercx commented 1 year ago

One more note: maybe we should also take a look at the protocol for the NSCF part. We currently just set kpoints_distance to a very dense 0.05 Å⁻¹ for the precise protocol, somewhat arbitrarily. Perhaps that is unnecessarily dense, and reducing the density (i.e. increasing kpoints_distance) would lead to fewer failures.
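
For reference, a quick way to see how dense a mesh a given kpoints_distance produces for bulk FCC Pt, and to compare it against QE's compiled-in limit (`npk`, 40000 in recent releases, set in Modules/parameters.f90). This uses essentially the same `set_kpoints_mesh_from_density` call as `create_kpoints_from_distance` in the work chains; the lattice parameter is just an approximate value for illustration:

```python
from aiida import load_profile, orm
from ase.build import bulk

load_profile()

NPK = 40_000  # default `npk` in Modules/parameters.f90 of recent QE releases; check your build

# Approximate FCC Pt primitive cell, for illustration only.
structure = orm.StructureData(ase=bulk('Pt', 'fcc', a=3.92))

kpoints = orm.KpointsData()
kpoints.set_cell_from_structure(structure)
kpoints.set_kpoints_mesh_from_density(0.05)  # the NSCF kpoints_distance under discussion

mesh, _offset = kpoints.get_kpoints_mesh()
total = mesh[0] * mesh[1] * mesh[2]
print(f'{mesh[0]}x{mesh[1]}x{mesh[2]} mesh = {total} k-points (pw.x compiled limit: npk = {NPK})')
```

The full mesh size is only an upper bound on what pw.x actually has to handle (symmetry reduction and the NSCF settings matter), but it gives a quick feel for how dense 0.05 Å⁻¹ is compared to the distances we use for the SCF.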