openkim / kliff

KIM-based Learning-Integrated Fitting Framework for interatomic potentials.
https://kliff.readthedocs.io
GNU Lesser General Public License v2.1

Fixing uq pool #66

Closed. yonatank93 closed this 2 years ago

yonatank93 commented 2 years ago

Summary

Update the uq module so that it behaves better when the sampling runs in parallel, especially with hybrid parallelization that uses multithreading in the loss evaluation and MPI in the sampling.

This is related to Issue #62.
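As an illustration of the two levels of parallelism described above, here is a minimal sketch that does not use KLIFF at all. It assumes the convention, used by samplers such as emcee, that the sampler is handed a pool object with a `map` method and calls it once per batch of walkers. The outer `ThreadPoolExecutor` is only a stand-in for an MPI pool, and the names `CONFIGS`, `loss`, `log_prob`, and `evaluate_walkers` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy data set: each "configuration" is an (x, y) pair generated by y = 2x + 1.
CONFIGS = [(float(x), 2.0 * x + 1.0) for x in range(8)]


def residual_sq(params, config):
    """Squared residual of a linear model for one configuration."""
    slope, intercept = params
    x, y = config
    return (slope * x + intercept - y) ** 2


def loss(params, nthreads=2):
    """Inner level: spread the per-configuration work over threads."""
    with ThreadPoolExecutor(max_workers=nthreads) as inner:
        return sum(inner.map(lambda c: residual_sq(params, c), CONFIGS))


def log_prob(params):
    """Unnormalized log-posterior with a flat prior: the negative loss."""
    return -loss(params)


def evaluate_walkers(walkers, pool):
    """Outer level: one log-probability evaluation per walker via pool.map."""
    return list(pool.map(log_prob, walkers))


walkers = [(2.0, 1.0), (1.0, 0.0), (0.0, 0.0)]
with ThreadPoolExecutor(max_workers=3) as outer:  # stand-in for an MPI pool
    log_probs = evaluate_walkers(walkers, outer)
print(log_probs)
```

In a real run the outer pool would distribute walkers across MPI ranks, while each rank evaluates the loss with its own threads. Pure-Python threads do not give true CPU parallelism, so this sketch shows the structure only.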

Known Issue

Fix

yonatank93 commented 2 years ago

@mjwen Can you take a look at this? Also, do we want to add documentation to "How To: Run in parallel mode" about the parallelization options when running MCMC sampling? For example, it could show how a user can combine multithreading in the loss evaluation with MPI in the sampling, and point out that we currently don't support MPI in the loss evaluation when running MCMC sampling.

mjwen commented 2 years ago

Thank you @yonatank93 for the PR! I will take a look at this soon. In the meantime, can you fix the linting and test failures? For the tests, it seems ptemcee is needed but is not installed in the GH Actions workflow. You may want to install it, e.g., below this line.

yonatank93 commented 2 years ago

I have updated the PR so that the linting and tests pass. Coming back to my question: do we want to provide an example/discussion of how to run multiprocessing for UQ?

mjwen commented 2 years ago

> I have updated so that the linting and test pass. And coming back to my question, do we want to provide an example/discussion about how to run multiprocessing for UQ?

Yes, I think this would be great! It could closely follow the existing UQ example, with changes only in the parallelization part.

mjwen commented 2 years ago

@yonatank93 You probably did not see my previous reply. The PR is ready to be merged. Are there any last changes you want to make?

yonatank93 commented 2 years ago

@mjwen Currently, I only briefly mention parallelization for the MCMC sampling in the example. Do we want to add more detailed documentation about it?

I also haven't figured out how to avoid using a global variable. I want to work on this, but it can wait for another PR.
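For context, one general way to avoid a module-level global in pool-based sampling is to bind the state to a callable object and pass that object to the pool. The sketch below is not taken from this PR. It assumes the global holds the loss that each worker needs, which the thread does not confirm, and `LogPosterior`, `toy_loss`, and the bounds are hypothetical names and values.

```python
from concurrent.futures import ThreadPoolExecutor


class LogPosterior:
    """Callable that carries the loss with it instead of reading a global."""

    def __init__(self, loss_fn, bounds):
        self.loss_fn = loss_fn
        self.bounds = bounds  # list of (low, high) pairs, one per parameter

    def __call__(self, params):
        inside = all(lo <= p <= hi for p, (lo, hi) in zip(params, self.bounds))
        if not inside:
            return float("-inf")  # uniform prior: zero probability outside bounds
        return -self.loss_fn(params)


def toy_loss(params):
    """Hypothetical loss with its minimum at (1, 1)."""
    return sum((p - 1.0) ** 2 for p in params)


log_post = LogPosterior(toy_loss, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
walkers = [(1.0, 1.0), (0.0, 3.0), (9.0, 0.0)]
with ThreadPoolExecutor(max_workers=2) as pool:
    values = list(pool.map(log_post, walkers))
print(values)
```

With a process-based or MPI pool the callable has to be picklable, so this only works if the wrapped loss object can be pickled. That constraint is a common reason for falling back to a global.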

yonatank93 commented 2 years ago

@mjwen I just added more thorough documentation about parallelization in MCMC. Do you have any comments on that? Otherwise, I don't have any other changes for this PR.