DrugEx is an open-source software library for de novo design of small molecules with deep learning generative models in a multi-objective reinforcement learning framework. The package contains multiple generator architectures and a variety of scoring tools and multi-objective optimisation methods. It has a flexible application programming interface and can readily be used via the command line interface [[1](https://pubs.acs.org/doi/10.1021/acs.jcim.3c00434)] (see [Quick Start](#quick-start) to get to work right away).

## History

This software is a continuation of the original and incremental work of Liu et al. on DrugEx [[2](https://doi.org/10.1186/s13321-019-0355-6),[3](https://doi.org/10.1186/s13321-021-00561-9),[4](https://doi.org/10.1186/s13321-023-00694-z)] and is currently developed by [Gerard van Westen's Computational Drug Discovery](https://twitter.com/cddleiden) group in Leiden, Netherlands. The first version of DrugEx [[2](https://doi.org/10.1186/s13321-019-0355-6)] consisted of a recurrent neural network (RNN) single-task agent built from gated recurrent units (GRU). The second version [[3](https://doi.org/10.1186/s13321-021-00561-9)] replaced these with long short-term memory (LSTM) units and introduced reinforcement learning (RL) based on multi-objective optimisation (MOO), together with an updated exploitation-exploration strategy. The third version [[4](https://doi.org/10.1186/s13321-023-00694-z)] introduced generators based on a variant of the transformer and a novel graph-based encoding that allows sampling of molecules with specific substructures. This package builds on these works and provides a unified API that is easier to use and flexible enough for customization. New features are being added as well [[1](https://pubs.acs.org/doi/10.1021/acs.jcim.3c00434)].
Furthermore, the development and training of QSAR models, which are used to score molecules during reinforcement learning, have been moved to a separate package, [QSPRpred](https://github.com/CDDLeiden/QSPRPred), which has become a useful library in its own right.

## Workflow

The DrugEx package provides classes to standardize, clean and encode molecules for the various deep learning algorithms in the package, as well as features to set up and monitor training and optimization. The resulting models can readily be used to generate focused libraries and are easily transferable.

![Fig1](figures/TOC_figure.png)

# Quick Start

> A small step for exploring the drug space in need, a giant leap for exploiting a healthy state indeed.

## Installation

DrugEx can be installed with pip like so:

```bash
pip install git+https://github.com/CDDLeiden/DrugEx.git@master
```

### Optional Dependencies

**[QSPRPred](https://github.com/CDDLeiden/QSPRPred.git)** - Optional package to install if you want to use the command line interface of DrugEx, which requires the models to be serialized with this package. It is also used by some examples in the tutorial. Install DrugEx with the following command if you want these features:

```bash
pip install "drugex[qsprpred] @ git+https://github.com/CDDLeiden/DrugEx.git@master"
```

**[RAscore](https://github.com/reymond-group/RAscore)** - Optional package to install if you want to use the Retrosynthetic Accessibility Score in the desirability function.

- Installing RAscore might downgrade the scikit-learn package. If this happens, scikit-learn should be upgraded again.

## Use

After installation, you will have access to various command line features, but you can also use the Python API directly. Documentation for the current version of both is available [here](https://cddleiden.github.io/DrugEx/docs/).
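
Before moving on, it can help to confirm which of the packages mentioned above actually ended up in your environment, for example to spot a scikit-learn downgrade after installing RAscore. The following is a minimal sketch using only the Python standard library; it is not part of DrugEx. The distribution names `qsprpred` and `RAscore` are assumptions based on the project names, and `installed_versions` is a hypothetical helper introduced for illustration.

```python
from importlib import metadata

# "drugex" and "scikit-learn" are the names used in the install notes above.
# "qsprpred" and "RAscore" are ASSUMED distribution names; adjust if they differ.
PACKAGES = ["drugex", "qsprpred", "RAscore", "scikit-learn"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Running the check once before and once after installing an optional dependency shows whether any version changed.
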
For a quick start, you can also check out our [Jupyter notebook tutorial](./tutorial), which documents the use of the Python API to build different types of models, or take a look at the [CLI examples](https://cddleiden.github.io/DrugEx/docs/use.html#cli-example). The tutorials and the documentation are still a work in progress, and we welcome contributions wherever they are lacking.

This repository contains almost all models implemented throughout the history of DrugEx. We also make the following pretrained models available for use with this package. You can retrieve them from the following table (not all models are available at the moment, but we will keep adding them):

| Data set | RNN (GRU) | RNN (LSTM) | SMILES-Based Transformer (BRICS) | SMILES-Based Transformer (RECAP) | Graph-Based Transformer (BRICS) | Graph-Based Transformer (RECAP) |
|---|---|---|---|---|---|---|
| ChEMBL 27 | - | Zenodo | - | - | Zenodo | - |
| ChEMBL 31 | Zenodo | Zenodo | - | - | Zenodo | - |
| Papyrus 05.5 | Zenodo | Zenodo | Zenodo | Zenodo | Zenodo | Zenodo |