glotzerlab / freud

Powerful, efficient particle trajectory analysis in scientific Python.
https://freud.readthedocs.io
BSD 3-Clause "New" or "Revised" License
282 stars 49 forks source link

Add class method for Box class which create a box from Lx,Ly,Lz and angles #1234

Closed DomFijan closed 7 months ago

DomFijan commented 8 months ago

Implements classmethods that create a box class from lattice vector matrix and L1, L2, L3 and angles, as well as methods that give L1, L2, L3 and angles from box class.

Description

Crystals are usually defined via their lattice vectors and wykoff sites or lengths of lattice vectors and angles; while freud and hoomd use cell parameter matrix. In this PR I implement class method for creation of box from lattice vectors or from lattice vector lengths and angles. I've also implemented a method that gives lengths of box vectors and angles from box object.

Motivation and Context

Lattice vectors or lattice vector lengths and angles are used to define real crystals and having a method that works with this makes creating crystals easier.

How Has This Been Tested?

Tests were added for new methods for box testing.

Types of changes

Checklist:

janbridley commented 8 months ago

I personally am a proponent of using math.sqrt and math.acos, as for single values it is significantly faster than the numpy alternatives. The difference is on the order of μs, but free performance is free imo.

DomFijan commented 8 months ago

I personally am a proponent of using math.sqrt and math.acos, as for single values it is significantly faster than the numpy alternatives. The difference is on the order of μs, but free performance is free imo.

It seems numpy sqrt is used throughout freud from what I can tell. If we were to start using math functions we should change all occurrences. I agree it makes no sense to leave performance on the table, but realistically the differences will be so small they are inconsequential. Perhaps this would be a good first issue to tackle for someone new?

janbridley commented 8 months ago

I personally am a proponent of using math.sqrt and math.acos, as for single values it is significantly faster than the numpy alternatives. The difference is on the order of μs, but free performance is free imo.

It seems numpy sqrt is used throughout freud from what I can tell. If we were to start using math functions we should change all occurrences. I agree it makes no sense to leave performance on the table, but realistically the differences will be so small they are inconsequential. Perhaps this would be a good first issue to tackle for someone new?

Noted, and sounds good! I will raise an issue.

Edit: See #1236

tommy-waltmann commented 8 months ago

I personally am a proponent of using math.sqrt and math.acos, as for single values it is significantly faster than the numpy alternatives. The difference is on the order of μs, but free performance is free imo.

Keep in mind freud modules are compiled with Cython, I wouldn't trust any performance data that is done outside a cython-compiled python module for use in freud.

DomFijan commented 8 months ago

I think it makes sense to remove the class methods for UnitCell for now and to completely remove from_lattice_vectors for now. We can discuss further how this should be approached and implement more substantial changes in a future PR. I will update PR description accordingly.

ispivack commented 8 months ago

All looks good to me, just one comment. If we want to just use freud for the EBT box, it would be good to have a convenience function converting to a unit vector + lengths description. Do we feel that this would fit into freud, or should this just be separately implemented in ebonds?

tommy-waltmann commented 8 months ago

All looks good to me, just one comment. If we want to just use freud for the EBT box, it would be good to have a convenience function converting to a unit vector + lengths description. Do we feel that this would fit into freud, or should this just be separately implemented in ebonds?

I would start by adding the functionality into ebonds. If it becomes cumbersome to have to do the conversion in ebonds constantly, then adding a method to freud to do that might be worth it.

Right now, the best way to get lengths and unit vectors would be to get the box vectors (they are the columns of box.to_matrix()) and get magnitude and direction from that.

janbridley commented 8 months ago

np.testing.assert_allclose sets atol=0 by default. This will not work with floating point calculations, so I set atol to 1e-14. However, the tests are still failing due to mismatched NaNs.

tommy-waltmann commented 8 months ago

np.testing.assert_allclose sets atol=0 by default. This will not work with floating point calculations, so I set atol to 1e-14. However, the tests are still failing due to mismatched NaNs.

It's fine to leave atol as 0.0. According to the documentation for assert_allclose: "It compares the difference between actual and desired to atol + rtol * abs(desired)", so having an rtol is enough to account for floating point error.

joaander commented 8 months ago

np.testing.assert_allclose sets atol=0 by default. This will not work with floating point calculations, so I set atol to 1e-14. However, the tests are still failing due to mismatched NaNs.

It's fine to leave atol as 0.0. According to the documentation for assert_allclose: "It compares the difference between actual and desired to atol + rtol * abs(desired)", so having an rtol is enough to account for floating point error.

A non-zero atol is needed when a value close to zero should be accepted as a desired (exactly) zero value. atol + rtol * abs(desired) == 0.0 when atol=0 and desired=0.

janbridley commented 7 months ago

I have tested this method against box lengths and angles from 1,104 real CIF files and all cases pass. I can export the box params as a numpy array and save it as test data, or we can adjust the current test to be more realistic.

tommy-waltmann commented 7 months ago

I have tested this method against box lengths and angles from 1,104 real CIF files and all cases pass. I can export the box params as a numpy array and save it as test data, or we can adjust the current test to be more realistic.

The code should be written so invalid sets of angle inputs raise an error to the user instead of creating invalid boxes. This PR needs an update to the code, not just the tests, before it's ready to be merged.

Ideally, we would know the mathematical constraints on the box angles so we could check that the inputs are valid, and we should make an effort to figure out what those are. If we cannot get those relations, then we could move forward with an updated set of tests and a "garbage in, garbage out" note in the documentation, but again, I only think we should take the second option if we have already given the first option some effort and failed.

DomFijan commented 7 months ago

I think I figured it out. The restriction is the following: $1+2\cos(\alpha)\cos(\beta)\cos(\gamma) - \cos(\alpha)^2 - \cos(\beta)^2 - \cos(\gamma)^2>0$ This is part of the computation of volume of a Parallelepiped. This guarantees that the volume of the box defined is positive and must always be true. This statement should be equivalent to requiring the value under the sqrt in the from_box_lengths_and_angles function be positive (haven't done the maths to confirm that).

The geometric interpretation of this is the violation of the cosine law.