recursionpharma / gflownet

GFlowNet library specialized for graph & molecular data
MIT License

gflownet

GFlowNet-related training and environment code on graphs.

Primer

GFlowNet [1], [2], [3], short for Generative Flow Network, is a generative modeling framework particularly suited to discrete, combinatorial objects. This library implements it for graph generation.

The idea behind GFN is to estimate flows in a (graph-theoretic) directed acyclic network. The network represents all possible ways of constructing objects, and so knowing the flow gives us a policy which we can follow to sequentially construct objects. Such a sequence of partially constructed objects is a trajectory. Perhaps confusingly, the network in GFN refers to the state space, not a neural network architecture.
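As a minimal sketch of how flows yield a policy, the toy example below normalizes the flow on each outgoing edge by the total outflow of its parent state. The state names and flow values are made up for illustration, and the helper functions are hypothetical; none of this is the library's API.

```python
from collections import defaultdict

# Hypothetical edge flows on a tiny DAG of partially constructed objects.
# Inflow equals outflow at the interior states "a" and "b".
edge_flow = {
    ("s0", "a"): 3.0,
    ("s0", "b"): 1.0,
    ("a", "ab"): 2.0,
    ("a", "ac"): 1.0,
    ("b", "ab"): 1.0,
}


def forward_policy(edge_flow):
    """Compute P(child | parent) = F(parent -> child) / total outflow of parent."""
    out_flow = defaultdict(float)
    for (parent, _child), flow in edge_flow.items():
        out_flow[parent] += flow
    policy = defaultdict(dict)
    for (parent, child), flow in edge_flow.items():
        policy[parent][child] = flow / out_flow[parent]
    return dict(policy)


def trajectory_probability(policy, trajectory):
    """Multiply the step probabilities along a sequence of states."""
    prob = 1.0
    for parent, child in zip(trajectory, trajectory[1:]):
        prob *= policy[parent][child]
    return prob


policy = forward_policy(edge_flow)
print(policy["s0"])
print(trajectory_probability(policy, ["s0", "a", "ab"]))
```

In this toy network the object "ab" can be reached by two trajectories, and summing their probabilities gives 3/4, which matches its share of the total flow into terminal states (3 out of 4).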

The main focus of this library, although it can do other things, is constructing graphs (e.g. graphs of atoms) node by node. To make policy predictions, we use a graph neural network. This GNN outputs per-node logits (e.g. add an atom to this atom, or add a bond between these two atoms), as well as per-graph logits (e.g. stop/"done constructing this object").
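One plausible way to turn per-node and per-graph logits into a single distribution over actions is a softmax over their concatenation. The sketch below assumes exactly that; the function name, the action tuples, and the logit values are hypothetical, and the library's own handling may differ.

```python
import math


def action_distribution(node_logits, graph_logits):
    """Softmax over the concatenation of per-node and per-graph logits.

    node_logits: one list per node, holding one logit per node-level action type.
    graph_logits: logits for graph-level actions, e.g. [stop].
    Returns a list of (action, probability) pairs.
    """
    actions, logits = [], []
    for node_idx, row in enumerate(node_logits):
        for act_idx, logit in enumerate(row):
            actions.append(("node", node_idx, act_idx))
            logits.append(logit)
    for act_idx, logit in enumerate(graph_logits):
        actions.append(("graph", act_idx))
        logits.append(logit)
    max_logit = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - max_logit) for logit in logits]
    total = sum(weights)
    return [(action, weight / total) for action, weight in zip(actions, weights)]


# Two nodes with two hypothetical node-level action types each, plus one "stop" logit.
dist = action_distribution([[0.0, 1.0], [0.5, -1.0]], [2.0])
for action, prob in dist:
    print(action, round(prob, 3))
```

Normalizing jointly, rather than per node, lets the "stop" action compete directly with every node-level action.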

This library supports a variety of GFN algorithms (as well as some baselines), and supports training on a mix of existing data (offline) and self-generated data (online), the latter being obtained by querying the model sequentially to obtain trajectories.
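The mix of offline and online data can be pictured with the sketch below, which fills part of a batch from a stored dataset and the rest by sampling trajectories step by step. Everything here is an assumption for illustration: `mixed_batch`, `sample_online`, the `offline_ratio` parameter, and the toy random "model" are not taken from the library.

```python
import random


def sample_online(rng, max_len=5):
    """Toy stand-in for querying a model sequentially until it chooses to stop."""
    trajectory = ["s0"]
    while len(trajectory) < max_len:
        action = rng.choice(["add_node", "stop"])
        trajectory.append(action)
        if action == "stop":
            break
    return trajectory


def mixed_batch(offline_data, sample_online, batch_size, offline_ratio, rng):
    """Build a batch with roughly offline_ratio of its trajectories from offline_data."""
    n_offline = min(int(round(batch_size * offline_ratio)), len(offline_data))
    batch = rng.sample(offline_data, n_offline)  # existing (offline) trajectories
    while len(batch) < batch_size:
        batch.append(sample_online(rng))  # self-generated (online) trajectories
    return batch


# Hypothetical offline trajectories, tagged so they are easy to tell apart.
offline = [["s0", "offline", str(i)] for i in range(10)]
batch = mixed_batch(offline, sample_online, batch_size=8, offline_ratio=0.25, rng=random.Random(0))
print(len(batch))
```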

Installation

PIP

This package is installable with pip, but since it depends on some torch-geometric package wheels, the --find-links argument must be specified as well:

pip install -e . --find-links https://data.pyg.org/whl/torch-2.1.2+cu121.html

Or for CPU use:

pip install -e . --find-links https://data.pyg.org/whl/torch-2.1.2+cpu.html

To install or depend on a specific tag, for example v0.0.10, use the following scheme:

pip install git+https://github.com/recursionpharma/gflownet.git@v0.0.10 --find-links ...

If package dependencies do not resolve correctly, you may need to install the exact frozen versions listed in requirements/, e.g. pip install -r requirements/main-3.10.txt.

Getting started

A good place to get started immediately is the sEH fragment-based multi-objective optimization (MOO) task. The file seh_frag_moo.py is runnable as-is (although you may want to change the default configuration in main()).

For a gentler introduction to the library, see Getting Started. For a more in-depth look at the library, see Implementation Notes.

Repo overview

See implementation notes for more.

Developing & Contributing

External contributions are welcome.

To install the developer dependencies:

pip install -e '.[dev]' --find-links https://data.pyg.org/whl/torch-2.1.2+cu121.html

We use tox to run tests and linting, and pre-commit to run checks before committing. To ensure that these checks pass, simply run tox -e style and tox run to run linters and tests, respectively.

For more information, see Contributing.