Closed JPchico closed 1 year ago
This can be solved by crating a calcfunction that is called at the workflow level, The callcfunction will be available for the users that want to use it to keep the provenance at the calculation level, but it is called still in the calculation but as a normal function, thus avoiding possible "non-lammps" friendly structures from being run. If they are already transformed nothing should happen.
This is being handled in the workchain, where the generation of the structure is a calcfunction
. The only thing this calcfunction
does is to generate the LAMPPS
compatible structure format.
Of this way the provenance on how the structure is modified is kept.
The modification is run once more internally as a function (as it is a part of the calculation), however, this should do nothing, as the transformation matrix in this case should be the identity matrix.
I was looking at this again, and I have added the calcfunction
as it it discussed here.
However, I do wonder if it wise to have it always on, i.e. at the calculation level.
We are going to be adding a workchain
that is what the users are meant to use anyhow, and in there adding the calcfunction
makes sense since we can keep track of the provenance and so on.
However, if someone decides to use the low-level calculation maybe it is fair to assume that they should keep track on whether or not one is running with a LAMMPS
friendly structure.
What do you think @chrisjsewell , @sphuber? Any strong opinions on the matter?
If I understand correctly, the generate_lammps_structure
is merely writing the representation of the structure as defined by the StructureData
input, into a different representation, one that LAMMPS understands. Is it actually functionally changing the structure, or is it merely a different representation? If it is the latter, I think we don't need a calcfunction at all, and having the CalcJob
plugin do this internally is perfectly line. That is exactly what the calcjob plugin is supposed to be doing: taking input represented by Data
nodes and writing it in a representation that the target code understands.
While the transformation is such that the structure is the same by symmetry this can be confusing to the user as the input and output structure would look quite different.
Is there always an output structure? Even for calculations where the structure is not modified, such as a single point calculation of total energy? In that case, I guess it would be a bit weird to see a different output structure compared to the input; but I would say that an output structure is not even needed in this case. In any other calculations that are meant to change the structure, such as in optimizations or dynamics, it makes perfect sense to me that the output structure will be different.
As far as I understand it should just be a different representation, that is correct. In that case the structure is not changing itself, it is just being represented in a different way.
Is there always an output structure? Even for calculations where the structure is not modified, such as a single point calculation of total energy? In that case, I guess it would be a bit weird to see a different output structure compared to the input; but I would say that an output structure is not even needed in this case. In any other calculations that are meant to change the structure, such as in optimizations or dynamics, it makes perfect sense to me that the output structure will be different.
I think that by default we now have always an output structure, I had not thought about the single point energy case, that is a good point.
I agree with your point that the structure before and after the calculation should be different (unless nothing happened of course), my concern was mostly about the user wondering about why suddenly the cell could be in a different representation. But perhaps it is just simpler to instead say this in the documentation, as the structure is not really different, just written in a different way.
But perhaps it is just simpler to instead say this in the documentation, as the structure is not really different, just written in a different way.
I think this is fair. It prevents additional complexity and always having a bunch of extra nodes in the provenance graph that are perhaps "just" bloat.
I agree, so unless there are any objections I will just add a section in the documentation about this and leave as it is (I made some small changes to not rely in the ase structure if not just use the base node instead)
I noticed that I already had included a note about this in the documentation.
LAMMPS requires the simulation cell to be in the format of a lower triangular matrix (right-handed basis). Therefore the cell and positions may require rotation and inversion. This is automatically done to every structure at the calculation level, so it might be that the cell that is provided is modified so that it follows this convention.
So unless there are more reasons to be explicit with this, I think that I'll push the PR with the small changes that I made so that one does not need to depend on the ASE abstraction of the structure and then we can consider this closed. Any objections @sphuber and @chrisjsewell ?
Something that I noticed in the structure of the code when checking this is that the common
folder is actually a set of parser utilities (mostly related to the input generation). I think that for clarity I'll move these functions to the parser folder, as that is where they are used and I think that is where they should be.
Closed via #70
Lammps seems to require that the structure be written in a specific way as to function. Right now this is taken care in the code by the
generate_lammps_structure
function. Which means that the structure that is actually passed to the run in not exactly the same than the one in the input, it is transformed.While the transformation is such that the structure is the same by symmetry this can be confusing to the user as the input and output structure would look quite different.
The question is then, is this transformation always needed? Should one transform the output structure so that it matches what one expected from the input (by using the inverse transformation)?
This is tricky since this happens at the
Calculation
level and hence we cannot call acalcfunction
to keep track of what is happening to the structure itself.