PhasesResearchLab / pySIPFENN

Python toolset for Structure-Informed Property and Feature Engineering with Neural Networks. It offers unique advantages through (1) effortless extensibility, (2) optimizations for ordered, dilute, and random atomic configurations, and (3) automated model tuning.
https://pysipfenn.org

pySIPFENN

Summary

This repository contains a Python toolset for Structure-Informed Property and Feature Engineering with Neural Networks (pySIPFENN), which implements a number of user-friendly tools for:

The underlying methodology, efficiency optimizations, design choices, and implementation specifics are given in the following publications:

A more complete (and verbose) description of the capabilities is given in the documentation at pysipfenn.org. You may also consider visiting our Phases Research Lab website at phaseslab.org.

Recent News:

Main Schematic

The figure below is the main schematic of the pySIPFENN framework, detailing the interplay of its internal components. The user interface provides a high-level API that processes structural data within core.Calculator, passes it to the featurization submodules in descriptorDefinitions to obtain a vector representation, and then passes that representation to the models defined in models.json, (typically) running automatically through all available models. All internal data of core.Calculator is directly accessible, enabling rapid customization. An auxiliary high-level API enables advanced users to operate on and retrain the models.

Main Schematic Figure
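
In code, the flow shown in the schematic might look like the minimal sketch below. It assumes pySIPFENN is installed, that the default models have already been fetched (for example with Calculator().downloadModels()), and that the directory holds structure files pymatgen can read. run_all_models is a hypothetical helper name, and the Calculator method names follow the project documentation, so check them against your installed version.

```python
from pathlib import Path


def run_all_models(structure_dir: str, results_csv: str = "results.csv"):
    """Featurize every structure file in `structure_dir` and run all loaded models."""
    if not Path(structure_dir).is_dir():
        raise NotADirectoryError(structure_dir)

    # Imported lazily so this module can be loaded without pySIPFENN installed.
    from pysipfenn import Calculator

    c = Calculator()  # picks up whatever models are available locally
    # Read the structures, compute KS2022 feature vectors, and run every
    # model that accepts that descriptor.
    c.runFromDirectory(directory=structure_dir, descriptor="KS2022")
    c.writeResultsToCSV(results_csv)
    return c.get_resultDicts()
```

Calling run_all_models("myInputFiles") would then return one result dictionary per structure and leave a CSV copy on disk; the directory name is only an example.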

Applications

pySIPFENN is a flexible tool that can, in principle and with very few modifications, be used to predict any property of interest that depends on an atomic configuration. The models shipped by default are trained to predict formation energy because that is what our research group is interested in; however, if one wanted to predict Poisson’s ratio and trained a model based on the same features, adding it would take minutes. Simply add the model in the open ONNX format and link it using the models.json file, as described in the documentation.
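
As a rough illustration of the linking step, the sketch below appends one entry to a models.json-style registry. The helper register_model and the field names "name" and "descriptor" are assumptions made for this example, not the confirmed schema; copy the actual fields from an existing entry in your models.json and follow the documentation for where the ONNX file should be placed.

```python
import json
from pathlib import Path


def register_model(models_json: Path, key: str, name: str, descriptor: str) -> dict:
    """Add one entry to a models.json-style registry and return the updated registry."""
    registry = json.loads(models_json.read_text()) if models_json.exists() else {}
    if key in registry:
        raise ValueError(f"model '{key}' is already registered")
    # Field names are illustrative assumptions; mirror an existing entry
    # in your own models.json rather than relying on these.
    registry[key] = {"name": name, "descriptor": descriptor}
    models_json.write_text(json.dumps(registry, indent=2))
    return registry
```

Refusing to overwrite an existing key is a deliberate choice here, so a custom model cannot silently replace one of the shipped ones.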

Real-World Examples

In our line of work, pySIPFENN and the formation energies it predicts are usually used as a computational engine that generates proto-data for the creation of thermodynamic databases (TDBs) using ESPEI (https://espei.org). The TDBs are then used through pycalphad (https://pycalphad.org) to predict phase diagrams and other thermodynamic properties.

Another of its uses in our research is guiding Density Functional Theory (DFT) calculations as a low-cost screening tool. Their efficient combination then drives the experiments leading to the discovery of new materials, as presented in these two papers:

Installation

pySIPFENN can be installed from the PyPI package repository, from the conda-forge package repository, or by cloning from GitHub directly. While not required, it is recommended to first set up a virtual environment using venv or Conda. This ensures that (a) one of the required versions of Python (3.9+) is used and (b) there are no dependency conflicts. If you have Conda installed on your system (see miniconda install instructions), you can create a new environment with:

conda create -n pysipfenn python=3.10 jupyter numpy 
conda activate pysipfenn
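
Once the environment is active, a quick check like the one below can confirm that the interpreter meets the 3.9+ requirement stated above. It uses only the standard library; python_is_supported and MIN_PYTHON are names made up for this sketch.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the installation notes


def python_is_supported(version=None) -> bool:
    """Return True if `version` (default: the running interpreter) is at least MIN_PYTHON."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor >= MIN_PYTHON


if __name__ == "__main__":
    print(sys.version.split()[0], "supported:", python_is_supported())
```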

If you are managing a large set of dependencies in your project, you may consider using mamba in place of conda. It is a less mature, but much faster drop-in replacement compatible with existing environments. See micromamba install instructions.

Standard

If your main goal is to run pySIPFENN models, whether provided by us or by any other vendor, you need only a subset of the capabilities of our code, so the standard install is sufficient. Simply install pySIPFENN from PyPI with pip install pysipfenn.

Developer Install

If you want to use pySIPFENN beyond its core functionalities, for instance, to train new models on custom datasets or to export models in different formats or precisions, you need to install several other dependencies. This can be done by following the from-source instructions above, but appending the dev extras marker to the last instruction:

pip install -e ".[dev]"

Note: pip install "pysipfenn[dev]" will also work, but will be less convenient for model modifications (which you likely want to do), as all persisted files will be located outside your working directory. You can quickly find where by calling import pysipfenn; c = pysipfenn.Calculator(); print(c), and Calculator will tell you (amongst other things) where they are.
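
If you only want the install location without instantiating the calculator, a standard-library lookup along these lines may be enough. It is a sketch that assumes the persisted files live under the package's installation directory; package_dir is a hypothetical helper, and printing the Calculator as described above remains the authoritative way to see the actual paths.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_dir(name: str = "pysipfenn") -> Optional[Path]:
    """Return the directory a package is installed in, or None if it is not installed."""
    spec = importlib.util.find_spec(name)  # locates the package without importing it
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


if __name__ == "__main__":
    print(package_dir())
```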

Contributing

What to Contribute

If you wish to contribute to the development of pySIPFENN, you are more than welcome to do so by forking the repository and creating a pull request. As of Spring 2024, we are actively developing the code and should get back to you within a few days. We are also open to collaborations and partnerships, so if you have an idea for a new feature or a new model, please do not hesitate to contact us through GitHub issues or by email.

In particular, we are seeking contributions in the following areas:

Rules for Contributing

We are currently very flexible with the rules for contributing, despite being quite opinionated :)

Some general guidelines are:

Cite

If you use pySIPFENN software, please consider citing:

If you are using predictions from pySIPFENN models accessed through OPTIMADE from MPDD, please additionally cite: