HEXRD / hexrd

A cross-platform, open-source library for the analysis of X-ray diffraction data.
Other
56 stars 25 forks source link

Fix remove duplicate atoms #691

Closed saransh13 closed 1 month ago

saransh13 commented 1 month ago

The routine had some errors as pointed out by @ZackAttack614. More careful testing revealed some duplications and removal of sites with fractional occupancies. This PR is meant to fix these issues. There are a number of break statements in the function. So, stylistically the routine could be improved a little. In any case, this fixes #686.

pep8speaks commented 1 month ago

Hello @saransh13! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2024-08-06 22:04:05 UTC
codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 33.18%. Comparing base (54fffca) to head (395c760). Report is 11 commits behind head on master.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #691 +/- ## ========================================== + Coverage 31.83% 33.18% +1.35% ========================================== Files 130 130 Lines 21900 21698 -202 ========================================== + Hits 6971 7201 +230 + Misses 14929 14497 -432 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

kevindlewis23 commented 1 month ago

It sounds like you did some testing when you made this update, it shouldn't be very hard to incorporate those into unit tests. If you add a file in the tests directory, pytest will automatically run any function whose name starts with test_, and make sure no errors are thrown (that all assert statements pass).

You'll need to run pip install -r tests/requirements-dev.txt and set the environment variable "HEXRD_EXAMPLE_REPO_PATH" to poing to the hexrd examples repo. Then you should be able to just run pytest tests or pytest tests/[filename] to run all the tests or your specific tests to make sure they pass.

It seems like you've already done some testing, and the ideal testing suite is always to have a few normal examples as well as some edge cases, but for cases where that is more difficult to do, I've randomly generated a bunch of testcases and written code to verify the results. Let me know if you need any help with this, you can reach out to me on the hexrd slack

ZackAttack614 commented 1 month ago

@saransh13 are you able to get to this soon? I'm happy to provide assistance if needed. We would like to get dangling issues resolved this week so that we can move forward with the hexrd v1.0 implementation.

saransh13 commented 1 month ago

@kevindlewis23 Here's the test script (there is an associated .h5 file it reads data from that I can send). Some help in getting some test for this function would be very much appreciated!

from hexrd import material
from hexrd.material import load_materials_hdf5
import numpy as np
import h5py

f = h5py.File('testmat.h5')

mats = load_materials_hdf5(f)

print('=='*20)
print('original atomic positions')
print(mats['xtal1'].atom_pos)
mats['xtal1'].unitcell.remove_duplicate_atoms()
print('reduced atomic positions')
print(mats['xtal1'].atom_pos)

print('=='*20)
print('original atomic positions')
print(mats['xtal2'].atom_pos)
mats['xtal2'].unitcell.remove_duplicate_atoms()
print('reduced atomic positions')
print(mats['xtal2'].atom_pos)

print('=='*20)
print('original atomic positions')
print(mats['xtal3'].atom_pos)
mats['xtal3'].unitcell.remove_duplicate_atoms()
print('reduced atomic positions')
print(mats['xtal3'].atom_pos)

The output it produces is as follows:


original atomic positions
[[0.  0.  0.  1. ]
 [0.5 0.  0.5 1. ]]
reduced atomic positions
[[0. 0. 0. 1.]]
========================================
original atomic positions
[[0.  0.  0.  0.5]
 [0.  0.  0.  0.5]]
reduced atomic positions
[[0.  0.  0.  0.5]
 [0.  0.  0.  0.5]]
========================================
original atomic positions
[[0.         0.         0.         0.33333334]
 [0.         0.         0.         0.33333334]
 [0.         0.         0.         0.33333334]
 [0.5        0.         0.         1.        ]
 [0.5        0.5        0.25       1.        ]
 [0.5        0.5        0.         1.        ]]
reduced atomic positions
[[0.         0.         0.         0.33333334]
 [0.         0.         0.         0.33333334]
 [0.         0.         0.         0.33333334]
 [0.5        0.         0.         1.        ]
 [0.5        0.5        0.25       1.        ]]```
saransh13 commented 1 month ago

@ZackAttack614 Do you mind reviewing this (sans the unit test)? I'll work with @kevindlewis23 in the coming weeks to have this covered.

kevindlewis23 commented 1 month ago

The third test does not pass on the old implementation, was it wrong before?

Also can you fix the PEP8 warnings above?

saransh13 commented 1 month ago

It was only handling the case for two disordered atoms on the same site correctly e.g. CuZn. I updated it to correctly handle an arbitrary number of atoms e.g. High entropy alloys.

kevindlewis23 commented 1 month ago

With the PEP8 fixes (you can run the Black formatter which should fix it automatically) this should be good, but I think Zack needs to review this to merge since he requested changes. I'm a little bit worried this is missing a few potential testcases. I think there's only one that actually removes any atoms, and you never remove more than one. It would also be useful to add a testcase where two atoms are in the same place exactly but has to remove it (not a disordered alloy)

saransh13 commented 1 month ago

@kevindlewis23 I think the exact same position case is the trivial case (subset of the first test case). In any case, I have changed the positions to check the trivial case. This is ready to be merged now.