Materials-Consortia / OPTIMADE

Specification of a common REST API for access to materials databases
https://optimade.org/specification
Creative Commons Attribution 4.0 International

Does dimension_types refer to the periodicity of the volume or of the structure? #390

Open JPBergsma opened 2 years ago

JPBergsma commented 2 years ago

I just looked at some data in 2Dmatpedia. They simulate 2D materials. I however noticed that in one of their structures, some of the atoms stick through the top of the simulation volume. Apparently their simulation was performed in a box that was periodic in three dimensions yet the structure itself is only periodic in 2 directions. I was therefore wondering what the correct value of the "dimension_types" field would be in this case. Once we reach a conclusion about this, I can update the description in the OPTIMADE standard.

merkys commented 2 years ago

Good question. I have always thought dimension_types referred to the volume. But after re-reading the examples in the specification I am now a bit confused.

Looking back, I can see that dimension_types was introduced in #23. A comment there points to the definition of the same property in ESCDF, which now seems to be described here: https://gitlab.com/ElectronicStructureLibrary/escdf/escdf-specifications/-/blob/master/source/system.rst (the link in the OPTIMADE issue is broken and the Wayback Machine does not help).

Since ESCDF does not say anything about molecules, I assume the periodicity of the volume is meant. But I may be wrong.

JPBergsma commented 2 years ago

After reading the definition in ESCDF, I also think that the dimension_types field is meant to describe the periodicity of the simulation volume. I'll try to make a quick PR to clarify this in the OPTIMADE standard. Edit: Under nperiodic_dimensions it is specifically mentioned that it is about the structures, so I think I will wait until I have heard a few more opinions on this.

merkys commented 2 years ago

"Edit: Under nperiodic_dimensions it is specifically mentioned that it is about the structures, so I think I will wait until I have heard a few more opinions on this."

To me the following sentence from the description explains quite the contrary:

This property only reflects the treatment of the lattice vectors provided for the structure, and not any physical interpretation of the dimensionality of its contents.

Here "structure" means the simulated/real crystal (not molecule), I think. At least it seems that the specification is using the term "structure" to mean that.

You may ping authors of #23 or 13a766cd226925974183761fcc7f19de5f19d26d for their opinions.

JPBergsma commented 2 years ago

I looked at the examples under nperiodic_dimensions which use the word structure. And I interpreted structure as: "a group of atoms that are attached to each other". Hence my doubt. The sentence you highlighted makes it a lot clearer, though. So I will make a short PR about this.

merkys commented 2 years ago

While it is true that the specification is quite implicit about the actual meaning of "structure", I do not recall a requirement of all atoms to be bound. I think of the "structure" as input/output of a simulation (or result of an experiment), thus there should be no implications about connectivity. Anyway, a clarifying PR would be great.

rartino commented 2 years ago

Identifying unclear formulations and proposing clarifications is always welcome!

For me, the key phrase in the specification right now is (emphasis added):

For each of the three directions indicated by the three lattice vectors (see property lattice_vectors), this list indicates if the direction is periodic (value 1) or non-periodic (value 0).

This surely isn't perfectly phrased (can a "direction" be periodic? Isn't a direction 'a vector without a magnitude'?), but I find it really difficult to read this definition as saying anything other than that dimension_types specifies whether the unit cell has periodic boundary conditions along the respective lattice vector or not. I'm also fairly confident this was the original intent back when these fields were standardized as part of an OPTIMADE workshop.
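To make that reading concrete, here is a minimal Python sketch. The lattice values are made up for illustration, and the helper name is hypothetical (it is not part of any OPTIMADE library). It assumes, as I understand the specification, that nperiodic_dimensions is simply the number of entries equal to 1 in dimension_types.

```python
# Illustrative attributes for a slab in a 3D-periodic simulation cell.
# The numbers are invented; only the field names come from the specification.
slab_attributes = {
    "lattice_vectors": [
        [3.0, 0.0, 0.0],
        [0.0, 3.0, 0.0],
        [0.0, 0.0, 20.0],  # long third vector: slab plus vacuum
    ],
    # Under the "periodic boundary conditions of the cell" reading,
    # a cell simulated with 3D periodicity is flagged periodic everywhere.
    "dimension_types": [1, 1, 1],
}


def nperiodic_dimensions(dimension_types):
    """Count the lattice directions flagged as periodic (value 1)."""
    if len(dimension_types) != 3 or any(d not in (0, 1) for d in dimension_types):
        raise ValueError("dimension_types must be three values, each 0 or 1")
    return sum(dimension_types)


print(nperiodic_dimensions(slab_attributes["dimension_types"]))  # 3
print(nperiodic_dimensions([1, 1, 0]))  # 2
```

Under this reading the value says how the lattice vectors are treated, not whether the atoms happen to form a 2D sheet.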

@JPBergsma in the text about nperiodic_dimensions, IMO "structure" should be read as referring to "an OPTIMADE structure", i.e., what can be described by the structure endpoint. I think you misread that text if you think "structure" means the underlying structural entity.

JPBergsma commented 2 years ago

"And I interpreted structure as: 'a group of atoms that are attached to each other'."

This was probably not the clearest way to describe it. I meant that you can view a structure as a set of atom positions. I have tried to make it a bit clearer in PR #394.

BobHanson commented 2 years ago

2D structures are generally referred to as "slabs" or "surfaces". They have depth (some distance to allow for the number of layers of atoms to be included in the calculation), and they have "vacuum" -- the void that is used in some calculations so that the program can embed the 2D structure model within a 3D calculation environment. It is not unusual for the "nonperiodic" lattice vector to have a length of 500 -- simply a flag for the calculation program that this direction is for all practical purposes nonperiodic. (In the program, it is periodic, but the distance is so huge as to not allow any interaction between layers.)

Materials Studio (as well as Jmol now) allows the user to create a "slab box" based on a Miller plane, then "cleave" using that box to select a depth of atoms relative to that plane. The box is created to perfectly match the atoms in the nonperiodic direction -- with atoms at both the 0- and 1- planes of the box. After this, the step is to position the atoms within a larger box along the nonperiodic direction and add some amount of "vacuum" to pad a 3D box at the top and bottom of the box.

If one happens to have atoms in the 0-plane of that box when the vacuum is added, the result in Materials Studio is that those 0-plane atoms are moved to the 1-plane. (I don't know why. My guess is that the idea is to create a surface that removes bulk atoms but includes atoms above it.) In any case, when you see a "split" model -- with most of the model at the bottom of the box, and a monolayer of atoms at the top of the box, what you are seeing is the result of just such a vacuum configuration. From a practical sense, it doesn't matter whether the monolayer is at the top or the bottom of the box -- the calculation will consider this dimension periodic for calculational purposes.
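A small Python sketch of why the "split" model is equivalent to the unsplit one when the direction is treated as periodic. The fractional coordinates and the function names are invented for illustration; nothing here reflects how Materials Studio or Jmol implement this.

```python
def same_site_under_pbc(frac_a, frac_b, tol=1e-8):
    """True if two fractional coordinates differ by a whole number of cells."""
    diff = abs(frac_a - frac_b) % 1.0
    # diff is in [0, 1); a value close to 1 is also "close to 0" periodically.
    return min(diff, 1.0 - diff) < tol


def same_layers_under_pbc(zs_a, zs_b, tol=1e-8):
    """Compare two lists of fractional z-coordinates site by site."""
    return len(zs_a) == len(zs_b) and all(
        same_site_under_pbc(a, b, tol) for a, b in zip(zs_a, zs_b)
    )


# Hypothetical slab: three layers along the vacuum direction.
unsplit = [0.0, 0.1, 0.2]  # bottom layer sits in the 0-plane
split = [1.0, 0.1, 0.2]    # same layer moved to the 1-plane

print(same_layers_under_pbc(unsplit, split))  # True
```

The comparison is done site by site, so it assumes both lists keep the atoms in the same order.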

See, for example, https://chemapps.stolaf.edu/jmol/jsmol/simple2.htm?load%20http://optimade.2dmatpedia.org/v1/structures

Here we have a 2D structure -- a surface -- embedded in a 3D box. The periodicity of the system is, technically, 3D, but everyone understands that this is a 2D model expressed in a 3D box. And, most importantly, the calculation works.

This is why I said earlier in the workshop that the statement that the length of the nonperiodic vector "is not significant" is not really true. It is significant in terms of the calculation -- at least if one wants to know how the original calculation was carried out and perhaps duplicate or modify that calculation.

What I see at https://chemapps.stolaf.edu/jmol/jsmol/simple2.htm?load%20http://optimade.2dmatpedia.org/v1/structures is fine. It's just that the explanation in the specification is not particularly helpful.

JPBergsma commented 2 years ago

As a human, I have no problem interpreting the structure @BobHanson mentioned. But I think that automatically splitting the structure in a reliable way may be difficult in some cases, for example when the structure is undulating, or when the planar structure is tilted relative to the cell so that it passes through the cell multiple times. Perhaps we should have a second field similar to nperiodic_dimensions, for example n_structure dimensions, that indicates in how many directions the structure is periodic.
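To illustrate how fragile an automatic approach can be, here is a naive Python sketch that guesses the structure's periodicity from the largest empty gap along each lattice vector. Everything in it is an assumption for illustration: the function names, the 10.0 threshold, and the example cell are made up, and the gap is converted to a length by multiplying with the vector length, which only gives the true vacuum thickness when that vector is perpendicular to the other two.

```python
import math


def largest_gap(fracs):
    """Largest empty interval between fractional coordinates on a periodic axis."""
    pts = sorted(f % 1.0 for f in fracs)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(pts[0] + 1.0 - pts[-1])  # gap that wraps around the cell boundary
    return max(gaps)


def guess_structure_periodicity(lattice_vectors, frac_positions, min_vacuum=10.0):
    """Return a 0/1 flag per lattice vector: 0 if a large vacuum gap is found."""
    flags = []
    for axis, vec in enumerate(lattice_vectors):
        length = math.sqrt(sum(c * c for c in vec))
        gap = largest_gap([p[axis] for p in frac_positions]) * length
        flags.append(0 if gap >= min_vacuum else 1)
    return flags


# Hypothetical orthogonal cell with a thin slab near the bottom.
lattice = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 20.0]]
positions = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.1]]

print(guess_structure_periodicity(lattice, positions))  # [1, 1, 0]
```

For a flat slab aligned with the cell this works, but a tilted or strongly undulating sheet can cover most fractional coordinates along every vector, in which case this heuristic would report three periodic directions.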