openkim / kliff

KIM-based Learning-Integrated Fitting Framework for interatomic potentials.
https://kliff.readthedocs.io
GNU Lesser General Public License v2.1

Feature request: Parallelization of Model Creation #76

Open adityakavalur opened 1 year ago

adityakavalur commented 1 year ago

I am trying to fit a physics-based model (an LJ potential, using the OpenKIM LJ_ElliottAkerson_2015 model) to DFT data that runs into thousands of configuration snapshots with hundreds of atoms each, and it is taking a long time.

I am experimenting with a smaller problem, i.e. a smaller dataset and a severely restricted number of minimization iterations, and based on the profiling results it looks like _update_neigh in kliff/models/kim.py is the main bottleneck. However, the calc.create call that invokes it is not parallelized.
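For reference, a minimal sketch of the kind of profiling run described above, using Python's built-in cProfile; `work()` is a hypothetical stand-in for the calc.create(...) call being measured, not kliff code.

```python
# Sketch: profile a function and print the most expensive calls.
# `work()` is a placeholder workload, standing in for calc.create(...).
import cProfile
import io
import pstats

def work():
    # placeholder for the expensive neighbor-list construction
    return sum(i * i for i in range(100_000))

pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

# sort by cumulative time, as done when spotting _update_neigh as the hot spot
stream = io.StringIO()
pstats.Stats(pr, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Sorting by cumulative time surfaces the functions under which the run spends most of its time, which is how a single call like _update_neigh shows up as the bottleneck.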

So my questions are as follows:

  1. Have you all faced this before?
  2. Is there a reason why this is not parallelized? If not, is it on your TODO list?
  3. If you are not currently working on this, would you accept an MR?

For point 3 above, I poked around a little and it seems like applying the parmap1 function to KIMComputeArguments here would probably be the easiest way to proceed.
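As a rough illustration of the parallel-map idea suggested above, here is a minimal sketch using the standard multiprocessing module; `build_neighbor_list` and the list-of-lists "configurations" are hypothetical stand-ins, not kliff's actual API.

```python
# Sketch: map independent per-configuration setup work over a process pool.
from multiprocessing import Pool

def build_neighbor_list(config):
    # placeholder for the expensive per-configuration work in _update_neigh
    return sorted(config)

def create_parallel(configs, nprocs=2):
    # each configuration is independent, so a plain Pool.map suffices,
    # provided the objects sent to and from the workers are picklable
    with Pool(nprocs) as pool:
        return pool.map(build_neighbor_list, configs)

if __name__ == "__main__":
    print(create_parallel([[3, 1, 2], [9, 7, 8]]))  # [[1, 2, 3], [7, 8, 9]]
```

The picklability caveat in the comment is exactly what matters here: multiprocessing serializes arguments and return values when crossing process boundaries, which is where the discussion below runs into trouble.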

Thanks for developing this utility! It is really nifty and will hopefully standardize this aspect of atomistic modeling. Please let me know if anything above is unclear!

mjwen commented 1 year ago

Right, it is not parallelized. The reason is that this only needs to be done once at the beginning, before any optimization happens. For LJ this is probably not ideal, because LJ itself is cheap to evaluate, so the one-time setup cost dominates.

An MR is very welcome, and your suggestion seems like the right way to go!

adityakavalur commented 1 year ago

Hi @mjwen, thanks for the reply. Unfortunately, I have run into a blocker that I need some help with. I provide details below if you want a working example, but the crux of the situation is that when I try to parallelize it I run into the error TypeError: can't pickle kimpy.compute_arguments.PyComputeArguments objects, i.e. the object being returned here is not serializable as is. I tried to keep modifications to a minimum, so I used parmap2 to create parmap21 so that multiple non-sequential arguments can be iterated. However, if we are not able to send the entire object across a pipe, there seem to be two options:

  1. Make the function parmap21 very non-generic, i.e. initialize all the classes on the primary process and get back only the neighbor list, which is the expensive computation
  2. Use __getstate__/__setstate__ to severely reduce the object being sent across the pipe, something leveraging this

Neither of these is ideal, of course. If you can think of a better way, please let me know; I am not a Python expert by any stretch of the imagination.
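A pure-Python sketch of option 2: `ComputeArgs` is a hypothetical stand-in for PyComputeArguments, with a lambda playing the role of the unpicklable C++ state, and only the plain-data state is sent across the pipe.

```python
# Sketch: make an object with unpicklable internals picklable by reducing
# its state in __getstate__ and rebuilding the internals in __setstate__.
import pickle

class ComputeArgs:
    """Stand-in for a wrapped C++ object; `_native` is unpicklable."""

    def __init__(self, cutoff):
        self.cutoff = cutoff
        self._native = lambda r: r < cutoff  # lambdas cannot be pickled

    def __getstate__(self):
        # ship only the plain-data state across the pipe
        return {"cutoff": self.cutoff}

    def __setstate__(self, state):
        # rebuild the unpicklable member on the receiving side
        self.__init__(state["cutoff"])

clone = pickle.loads(pickle.dumps(ComputeArgs(3.5)))
print(clone.cutoff, clone._native(2.0))  # 3.5 True
```

The trade-off is the one named above: anything not captured in __getstate__ must be cheaply reconstructible on the worker, otherwise information is lost in transit.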

How to reproduce the problem:

  1. Use my fork https://github.com/adityakavalur/kliff
  2. Use the modified KLIFF tutorial example below
"""
.. _tut_lj:

Train a Lennard-Jones potential
===============================

In this tutorial, we train a Lennard-Jones potential that is built into KLIFF (i.e. not
a model archived on OpenKIM_). From a user's perspective, a KLIFF built-in model is no
different from a KIM model.

Compare this with :ref:`tut_kim_sw`.

.. _OpenKIM: https://openkim.org
"""
from kliff.calculators import Calculator
from kliff.dataset import Dataset
from kliff.loss import Loss
from kliff.models import LennardJones
from kliff.utils import download_dataset

from kliff.models import KIMModel

# training set
dataset_path = download_dataset(dataset_name="Si_training_set_4_configs")
tset = Dataset(dataset_path)
configs = tset.get_configs()

# calculator
model = KIMModel(model_name="LJ_ElliottAkerson_2015")

# fitting parameters
epslist = [[0.0, "fix"]] * 9730
siglist = [[0.0, "fix"]] * 9730
cutlist = [[0.0, "fix"]] * 9730

epslist[1729] = ["default", 0.001, None]
siglist[1729] = ["default", 1.977, None]
cutlist[1729] = ["default", "fix"]
model.set_opt_params(cutoffs=cutlist, sigmas=siglist, epsilons=epslist)

calc = Calculator(model)
# calc.create(configs, use_energy=True, use_forces=False)  # original serial call
calc.create(configs, use_energy=True, use_forces=False, nprocs=2)

# loss
loss = Loss(calc, nprocs=1)
result = loss.minimize(method="L-BFGS-B", options={"disp": True, "maxiter": 10})

# save the optimized model
model.save("kliff_model.yaml")

mjwen commented 1 year ago

@adityakavalur thanks for the detailed info!

The PyComputeArguments class is essentially a C++ object that we wrapped with pybind11 to create a Python interface. I just checked pybind11 quickly, and it seems such objects can be made picklable. @yafshar do you have experience with this?

adityakavalur commented 1 year ago

@mjwen thanks for the helpful comment! Looking at the pickling example in pybind11 here, it looks like they are also using __setstate__ and __getstate__, as I mentioned above, but they pack all the members of the object into a tuple and send that across, so there is no expected loss of information. I hadn't thought to do that. Let me see if I can get this to work; there will still be some non-generality in the new function parmap21, but a lot less than I initially thought.
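A pure-Python sketch of that tuple-based approach: __getstate__ packs every member into a tuple and __setstate__ unpacks it, so nothing is lost across the pipe. `NeighborState` is a hypothetical stand-in, not kliff's actual class, and this only mirrors the shape of pybind11's pickling example rather than its C++ implementation.

```python
# Sketch: lossless pickling by packing all members into a tuple.
import pickle

class NeighborState:
    def __init__(self, cutoff, indices):
        self.cutoff = cutoff
        self.indices = indices

    def __getstate__(self):
        # pack every member into a tuple, mirroring pybind11's pickling example
        return (self.cutoff, self.indices)

    def __setstate__(self, state):
        # unpack the tuple back into the members, in the same order
        self.cutoff, self.indices = state

clone = pickle.loads(pickle.dumps(NeighborState(3.5, [0, 2, 5])))
print(clone.cutoff, clone.indices)  # 3.5 [0, 2, 5]
```

Because every member round-trips, the reconstructed object is equivalent to the original, unlike the reduced-state variant where dropped members must be rebuilt.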