lich-uct / nonpher

Nonpher: software for design of hard-to-synthesize structures
GNU General Public License v3.0

Nonpher

The Nonpher Python package provides functions for generating hard-to-synthesize (HS) structures, along with several functions for calculating molecular complexity. Nonpher uses the molecular morphing algorithm implemented in the Molpher-lib [https://github.com/lich-uct/molpher-lib] library. In molecular morphing, new structures are generated iteratively by simple structural changes, such as the addition or removal of an atom or a bond. In Nonpher, molecular morphing is tuned so that the resulting structures are hard to synthesize without being overly complex. HS structures generated by Nonpher can be used as negative examples for training machine learning classifiers. The molecular morphing approach is described in Hoksza D. et al., J. Cheminform. 2014 Mar 21;6(1):7 [https://dx.doi.org/10.1186/1758-2946-6-7], and Nonpher in Voršilák M. and Svozil D., J. Cheminform. 2017 Mar 20;9(1):20 [https://dx.doi.org/10.1186/s13321-017-0206-2].

Installation

Prerequisites

Supported platforms:

Dependencies

For the conda installation, Molpher-lib is pinned to 0.0.0b2, RDKit to 2018.3.1 and libboost to 1.65.1 because of compatibility issues between these packages. With a newer or development version of Molpher-lib, these requirements are less strict.

Installation with Anaconda

Nonpher is distributed as a conda package. At the moment, this is the preferred way to install and use the library. All you need is the full Anaconda [https://www.anaconda.com/] distribution or its lightweight variant, Miniconda [https://docs.conda.io/en/latest/miniconda.html]. Anaconda is essentially a Python distribution, package manager and virtual environment tool in one, and it makes setting up a development environment easy. After installing Anaconda/Miniconda (and preparing an environment), you can run the following in the Linux terminal:

conda install -c rdkit -c lich nonpher

Installation with setup.py

Once you have installed RDKit [https://www.rdkit.org/] and Molpher-lib [https://github.com/lich-uct/molpher-lib], you can download or clone Nonpher and install it by running the following command in the repository root:

python setup.py install

Quick start

To generate HS structures, Nonpher requires that ions and charges be removed from the starting molecules. The input to Nonpher is a CSV file in which each line consists of an input compound ID and the input compound SMILES. The output CSV file contains the input compound ID, the input compound SMILES and the output HS compound SMILES. The script is run with the following command:

$ nonpher [-h] [-H] [INPUT_FILE [OUTPUT_FILE]]

where the -H parameter instructs the script to skip the first line (header) of the input CSV file.
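
As an illustration of the input format described above, the following sketch writes a two-column CSV file (compound ID, SMILES) using only the Python standard library. The helper names `is_neutral_single_fragment` and `write_nonpher_input`, the column names in the header, and the comma delimiter are assumptions made for this example and are not part of Nonpher. The filter is a simple text heuristic: it skips SMILES that contain a `.` (disconnected parts such as counter-ions) or a charged bracket atom. It does not neutralize or desalt molecules; a cheminformatics toolkit would be needed for that.

```python
import csv
import re

# A bracket atom carrying a formal charge, e.g. [NH4+] or [O-].
# Inside SMILES brackets, '+' and '-' only ever denote charge.
CHARGED_ATOM = re.compile(r"\[[^\]]*[+-][^\]]*\]")


def is_neutral_single_fragment(smiles):
    """Heuristic check: one connected component and no charged bracket atom."""
    return "." not in smiles and CHARGED_ATOM.search(smiles) is None


def write_nonpher_input(compounds, path, header=True):
    """Write (ID, SMILES) pairs to a CSV file and return the IDs that were skipped.

    Hypothetical helper for illustration; not part of the Nonpher package.
    """
    skipped = []
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        if header:
            # Illustrative column names; a file with a header row needs -H.
            writer.writerow(["id", "smiles"])
        for compound_id, smiles in compounds:
            if is_neutral_single_fragment(smiles):
                writer.writerow([compound_id, smiles])
            else:
                skipped.append(compound_id)
    return skipped
```

Because the file written this way starts with a header row, it would be passed to the script together with the -H option.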

Alternatively, from the directory where you downloaded Nonpher (NONPHER_REPOSITORY/nonpher), you can run the script directly:

python nonpher.py [-h] [-H] [INPUT_FILE [OUTPUT_FILE]]

Usage in Python

from nonpher import nonpher

# Starting structure: aspirin, given as a SMILES string
morph = nonpher.complex_nonpher("O=C(C)Oc1ccccc1C(=O)O")
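
For more than one molecule, the call above can be wrapped in a small loop that mirrors the CSV layout from the Quick start (input ID, input SMILES, output HS SMILES). The sketch below is only an illustration: `generate_hs_rows` and `write_hs_csv` are hypothetical helpers, not part of Nonpher, and the generator function is passed in as an argument so the code does not depend on how Nonpher is installed. It assumes that the generator takes a SMILES string and returns a SMILES string, and that a falsy return value means no structure was produced. This behaviour of `complex_nonpher` is not stated in this README and should be checked against the package.

```python
import csv


def generate_hs_rows(rows, generate):
    """Yield (ID, input SMILES, HS SMILES) for each row where generation succeeded."""
    for compound_id, smiles in rows:
        morph = generate(smiles)
        if morph:  # assumption: a falsy result means no structure was produced
            yield compound_id, smiles, morph


def write_hs_csv(in_path, out_path, generate, skip_header=False):
    """Read (ID, SMILES) rows, write three-column output rows, return the row count."""
    count = 0
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src)
        if skip_header:
            next(reader, None)  # same role as the -H option of the script
        # Ignore blank or malformed lines that lack an ID or a SMILES column.
        rows = ((row[0], row[1]) for row in reader if len(row) >= 2)
        writer = csv.writer(dst)
        for record in generate_hs_rows(rows, generate):
            writer.writerow(record)
            count += 1
    return count


# With Nonpher installed, the generator would be passed in like this:
#   from nonpher import nonpher
#   write_hs_csv("input.csv", "output.csv", nonpher.complex_nonpher, skip_header=True)
```

Passing the generator as a parameter also makes it easy to try the loop with a stand-in function before running the actual morphing.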