This repository contains a tutorial showing how Parsl can be used to write a machine-learning-guided search for high-performing molecules.
The objective of this application is to identify which molecules have the largest ionization energies (IE, the amount of energy required to remove an electron).
IE can be computed using various simulation packages (here we use xTB); however, execution of these simulations is expensive, and thus, given a finite compute budget, we must carefully select which molecules to explore.
In this example, we use machine learning to predict molecules with high IE based on previous computations (a process often called active learning). We iteratively retrain the machine learning model to improve the accuracy of predictions.
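The loop described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code used in the notebook: `simulate_ie` is a hypothetical stand-in for an expensive xTB calculation, each molecule is reduced to a single made-up numeric descriptor, and the surrogate model is a toy nearest-neighbor average rather than the model the tutorial trains. In the tutorial itself, Parsl is used to run the workflow.

```python
import random


def simulate_ie(x):
    """Hypothetical stand-in for an expensive simulation (e.g., an xTB run)."""
    return 10.0 - 20.0 * (x - 0.7) ** 2


def predict(train, x, k=3):
    """Toy surrogate: average IE of the k nearest already-computed molecules."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def active_learning(candidates, n_initial=8, n_rounds=5, batch_size=4, seed=0):
    """Alternate between ranking the pool with the surrogate and simulating the top picks."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)

    # Seed the training set with a few randomly chosen molecules
    train = [(x, simulate_ie(x)) for x in pool[:n_initial]]
    pool = pool[n_initial:]

    for _ in range(n_rounds):
        # "Retraining" here just means predicting from the enlarged training set
        pool.sort(key=lambda x: predict(train, x), reverse=True)
        # Spend the compute budget only on the most promising candidates
        batch, pool = pool[:batch_size], pool[batch_size:]
        train.extend((x, simulate_ie(x)) for x in batch)

    return train


# Made-up candidate pool: 200 random one-dimensional descriptors
descriptor_rng = random.Random(1)
candidates = [descriptor_rng.random() for _ in range(200)]

results = active_learning(candidates)
best_x, best_ie = max(results, key=lambda pair: pair[1])
print(f"Evaluated {len(results)} of {len(candidates)} candidates; best IE = {best_ie:.3f}")
```

Only 28 of the 200 candidates are ever simulated; the surrogate decides where the remaining budget goes after each round.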
The demo uses a few codes that are easiest to install with Anaconda. Our environment should work on both Linux and OS X and can be installed with:
conda env create --file environment.yml
The notebook steps through the various phases of the workflow.