*Closed: @t-reents closed this issue 9 months ago.*
Hi @t-reents, thanks for re-posting the issue more extensively. I don't quite understand what you mean by `[1, 1, 1, 1, 1, 1]` per atom?
Probably, in order not to complicate the logic too much, we can define a maximum number of concurrent workchains instead of the concept of batches. So we would have `max_concurrent_atoms` and `max_concurrent_qpoints`. Say we define `max_concurrent_atoms=2` and `max_concurrent_qpoints=3`.
Let's say our structure has 7 atoms to be perturbed and, for simplicity, 8 q-points for each atom (remember that in principle this number may change among different atoms within the same structure).
We would then have that:

1. `HpParallelizeAtoms` launches 2 `HpParallelizeQpoints` (one per atom).
2. Each `HpParallelizeQpoints` launches 3 `HpBase`; as soon as an `HpBase` finishes, the next `HpBase` is launched, until all 8 q-points of that atom are done.
3. `HpParallelizeAtoms` then launches the following 2 `HpParallelizeQpoints` for the next 2 atoms, and so on, ending with a single `HpParallelizeQpoints` for the last (7th) atom.

Is this what you had in mind?
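The batching scheme above can be sketched in plain Python. This is a hypothetical illustration, not the actual workchain code: the `chunked` helper and the variable names are assumptions, and only the batch sizes come from the example.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive batches of at most `size` items (hypothetical helper)."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch


max_concurrent_atoms = 2
max_concurrent_qpoints = 3

atoms = list(range(1, 8))    # 7 atoms to be perturbed
qpoints = list(range(1, 9))  # 8 q-points per atom (in general this may differ per atom)

# HpParallelizeAtoms would work through the atoms in batches of 2 ...
atom_batches = list(chunked(atoms, max_concurrent_atoms))
# ... and each HpParallelizeQpoints through its q-points in batches of 3.
qpoint_batches = list(chunked(qpoints, max_concurrent_qpoints))

print(atom_batches)    # [[1, 2], [3, 4], [5, 6], [7]]
print(qpoint_batches)  # [[1, 2, 3], [4, 5, 6], [7, 8]]
```

Each outer batch would only be submitted once the previous one has finished, so at most 2 × 3 = 6 `HpBase` run at any time.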
A further possibility could be to also have a single `HpBase` that runs more than one q-point at a time (e.g. by setting `start_qpoint=i` and `last_qpoint=i+n`). This may be useful for smaller structures having lots of q-points. On the other hand, since `hp.x` doesn't have restart options, I think this approach is eventually less appealing (see also #34).
Yes, this is exactly the logic that I implemented in my first draft.

With the term "batches", I was basically referring to what you described in your outline. E.g. in the second point, I also meant that instead of submitting all the `HpBase` at once, we submit an `HpBase` batch of size 3. So basically just a different wording.

Instead of specifying the numbers of concurrent atoms and q-points separately, I came up with the idea of specifying only the total number. Based on the number of atoms/sites determined after the initialization, the number of `HpBase` for the q-points is distributed accordingly. The length of the list corresponds to the number of concurrent `HpParallelizeQpoints`, and the elements to the number of concurrent `HpBase` in each of them. The function tries to distribute the number of `HpBase` uniformly over the different `HpParallelizeQpoints`.
> I don't quite understand what you mean by `[1, 1, 1, 1, 1, 1]` per atom?

The list encoded the following outline:

- `HpParallelizeAtoms` launches 5 `HpParallelizeQpoints` (the length of the list).
- Each `HpParallelizeQpoints` launches 1 `HpBase` (the elements of the list).

The first example, `[2, 2, 1]`, accordingly represented:

- `HpParallelizeAtoms` launches 3 `HpParallelizeQpoints` (the length of the list).
- The first two `HpParallelizeQpoints` launch 2 `HpBase` each, and the third one only 1 (the elements of the list).

This was just to explain what I meant with this list notation. But I would also agree on the separate inputs, so up to you.
Interesting approach. So you would define at a higher level (`HpWorkChain`) a `max_concurrent_base_workchains`, and then accordingly define the "sub-maximum" numbers of concurrent workchains? What I like is that it will indeed run at most a certain number of concurrent `HpBaseWorkChain`s, although it won't be guaranteed that exactly this number of `HpBaseWorkChain`s will be running concurrently.

One thing to consider is that if, say, one atom has lots of q-points, then all the other remaining workchains will have to wait: in AiiDA workflows, AiiDA has to wait for all the submitted processes before continuing in the outline. Considering that the number of q-points differs only slightly among different atoms, I guess it is acceptable. We can also start trying it out, and in any case this will be optional.
Yes, exactly, this was the idea, and I agree with your comments.
It would be nice if the user had more control over the number of `WorkChains` that are submitted at a time. All of the sub-`WorkChains`, e.g. in `SelfConsistentHubbardWorkChain` and `HpParallelizeQpointsWorkChain`, are currently submitted at once. This might lead to many submissions at the same time, which might be problematic for some clusters, or, in general, sum up to a large number of `WorkChains` in case one runs e.g. multiple `HpParallelizeAtomsWorkChains`. I was figuring that it might be useful to introduce something like submission in batches, i.e. submitting only N sub-`WorkChains` at a time, which would allow the user to exploit the advantages of parallel submission to a certain extent while still controlling the number of parallel `WorkChains`.

I already prepared a draft locally; the PR will follow soon. I would simply add a new input specifying the number of `WorkChains` per q-point submitted at a time for the `HpParallelizeAtomsWorkChain` and `HpParallelizeQpointsWorkChain`, given that the parallelization over q-points is enabled. The `SelfConsistentHubbardWorkChain` would also get a new input to specify the overall number of `WorkChains` submitted at a time (`HpParallelizeAtomsWorkChains` and `HpParallelizeQpointsWorkChains`). I thought about the following logic so far:

Assume we set the new input to 5 and our structure contains 3 perturbed atoms. Those 5 allowed `WorkChains` would be distributed as follows: `[2, 2, 1]`. This list refers to the number of q-point `WorkChains` per atom that are submitted at a time. Each `HpParallelizeQpointsWorkChain` will continue submitting new q-point batches once the previous batch has finished.

Another example: setting the input again to 5, but this time we have 6 perturbed atoms. This would result in the submission of `[1, 1, 1, 1, 1]` q-point `WorkChains` per atom. Once these `HpParallelizeQpointsWorkChains` are done, the last `HpParallelizeAtomsWorkChain` will be submitted, but this time with 5 possible q-point `WorkChains`, since we ensure that the others are done at this stage.

@bastonero In case you already have comments regarding this logic, please feel free to share them here. Otherwise, we can discuss once the PR is there.
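The two examples can be sketched as successive submission waves: each wave distributes the full budget over the atoms still to run, and a new wave starts once the previous one is done. This is a hypothetical illustration of the described logic, not the draft implementation; `submission_waves` is an assumed name.

```python
def submission_waves(budget, n_atoms):
    """Group `n_atoms` perturbed atoms into successive waves; within a wave,
    the budget of q-point WorkChains is spread uniformly (hypothetical sketch)."""
    waves, remaining = [], n_atoms
    while remaining:
        k = min(budget, remaining)            # atoms served in this wave
        base, rem = divmod(budget, k)
        waves.append([base + 1] * rem + [base] * (k - rem))
        remaining -= k
    return waves


print(submission_waves(5, 3))  # [[2, 2, 1]]
print(submission_waves(5, 6))  # [[1, 1, 1, 1, 1], [5]]
```

The second call shows the 6-atom example: five atoms run with 1 q-point `WorkChain` each, then the last atom alone gets all 5.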