materialsproject / emmet

Be a master builder of databases of material properties. Avoid the Kragle.
https://materialsproject.github.io/emmet/
Other
52 stars 64 forks source link

Add `run_ddec6` option to `Calculation.from_vasp_files` #869

Closed janosh closed 9 months ago

janosh commented 10 months ago

This PR adds the functionality needed to automatically run a DDEC6 charge analysis after any VASP calculation with atomate2. See https://github.com/materialsproject/atomate2/pull/587.

@chiang-yuan Should the doc string contain some instructions on how to set DDEC6_ATOMIC_DENSITIES_DIR and where to get those densities?

chiang-yuan commented 10 months ago

Yes I totally agree we should have that. It didn't require me long time to figure out what the files are and where to get them. They are in the author's repo here. Perhaps we should keep a copy/snapshot of all the files in case the authors are not actively maintaining the repo or just delete the files in the future?

janosh commented 10 months ago

Perhaps we should keep a copy/snapshot of all the files in case the authors are not actively maintaining the repo or just deleted the files in the future?

How big are the files gzipped? If they're < 50KB, we could keep a compressed copy in pymatgen and try downloading those files on the fly when users run chargemol.

codecov-commenter commented 10 months ago

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (24a74ea) 91.36% compared to head (6a3c0e5) 91.35%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #869 +/- ## ========================================== - Coverage 91.36% 91.35% -0.01% ========================================== Files 138 138 Lines 12748 12753 +5 ========================================== + Hits 11647 11651 +4 - Misses 1101 1102 +1 ``` | [Files](https://app.codecov.io/gh/materialsproject/emmet/pull/869?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=materialsproject) | Coverage Δ | | |---|---|---| | [emmet-core/emmet/core/vasp/calculation.py](https://app.codecov.io/gh/materialsproject/emmet/pull/869?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=materialsproject#diff-ZW1tZXQtY29yZS9lbW1ldC9jb3JlL3Zhc3AvY2FsY3VsYXRpb24ucHk=) | `76.25% <88.88%> (+0.05%)` | :arrow_up: |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Andrew-S-Rosen commented 10 months ago

@janosh and @chiang-yuan:

For your own benefit, please read the email I sent to Tom Manz below a long time ago, as it will be relevant for you if you use this feature. In short, run a test case on something like NaCl a few times and make sure it works as expected on your given compute node. I think it crops up when using a compute node with many CPUs for a small system.

Dear Dr. Manz,

I have run DDEC6 via Chargemol without issue for several years now. However, I wanted to bring to your attention some strange behavior when running the parallel version of Chargemol on certain machine architectures. I understand that this behavior may be difficult to reproduce, but I share it here with you for completeness anyway.

Summary: When using the parallel version of Chargemol (i.e. Chargemol_09_26_2017_linux_parallel) on the head node of NERSC's Cori machine (Intel Xeon Processor E5-2698), the predicted partial atomic charges (and other properties) are erratic on simple test systems. For instance, for the NaCl test case provided with Chargemol, the partial charge on Na is predicted to be -1.9, whereas with the serial executable it is +0.843 (as expected). Rerunning the parallel code yields very different partial charges on most runs, whereas the serial version consistently yields +0.843.

Importantly, I do not have this issue on all machine architectures, which has made it difficult to pinpoint the issue. Running the exact same setup on Intel Xeon nodes on a separate machine (one I've historically used for such calculations) yields the expected partial charges for NaCl every time regardless of the executable.

The Data: I have attached to this email the exact files I used to run Chargemol as well as the outputs. The "serial" and "parallel" folders refer to the serial and parallel Linux binaries of Chargemol, respectively.

Perhaps you may have some insight as to the root cause. If so, please let me know if there is any troubleshooting I can do to help out further. nacl_tests.zip

The response from Tom:

It seems like it is due to some issues of the system architecture and/or the compiler that was used on that system. You may need to recompile the code on that system using a few different versions of the GNU fortran compiler to see if one of them solves the issue. I've noticed that sometimes a particular version of the GCC fortran compiler works when another will produce errors when the code runs. So, my recommendation is to explore some options with different GCC fortran compiler versions.

If you transferred the code from one system to another without recompiling it, this sometimes works if the two systems have close enough architectures. But, it may be in your case that the architecture is different enough that if the code was compiled on a vastly different architecture that it needs to be recompiled.

So, in summary ... I think your issue sounds like a compiler issue.

Andrew-S-Rosen commented 10 months ago

@chiang-yuan: That is not the repo for DDEC6! It is here: https://sourceforge.net/projects/ddec/files/. The repo you linked is unaffiliated with the Chargemol project.

chiang-yuan commented 10 months ago

@Andrew-S-Rosen Thanks for the catch. You are definitely spot on, although the time stamps on two side are the same... I should be more careful about this...

chiang-yuan commented 10 months ago

@janosh I compressed the whole directory into tar.gz and it is about 2.3M. Sounds like it is not very small?

chiang-yuan commented 10 months ago

@Andrew-S-Rosen to the information you provided above, I have been compiling ddec charges for all MP computed entries. As far as I've seen for Na, there is no such erroneous entry with Na of -1.9. Very few of them are mildly negatively charged (roughly around or above -0.5) but those might be charge sloshing in pure metal.

Andrew-S-Rosen commented 10 months ago

Good good! That's consistent with my experience too but just wanted you to be alerted. I only saw it on the head node on Cori, not the compute node, so who knows. And of course, Cori is gone now.

janosh commented 10 months ago

2.3M is not great but still ok. If they've been hosted reliably for a long time where they are now, we can also pull from there.