Chemical space exploration is a major task of the hit-finding process during the pursuit of novel chemical entities. Compared with other screening technologies, computational de novo design has become a popular approach to overcome the limitations of current chemical libraries. Here, we report a de novo design platform named systemic evolutionary chemical space explorer (SECSE). The platform was conceptually inspired by fragment-based drug design and miniaturizes a “lego-building” process within the pocket of a given target. The key to generating virtual hits thus becomes a computational search problem. To enhance search and optimization, human intelligence and deep learning are integrated. SECSE has the potential to find novel and diverse small molecules that are attractive starting points for further validation.
Setting up dependencies
python ~=3.9, perl ~=5.32
conda create --name secse -c conda-forge parallel tqdm biopandas openbabel chemprop xlrd=2 pandarallel rdkit=2022.09
conda activate secse
Installing from source
git clone https://github.com/KeenThera/SECSE.git
Setting Environment Variables
export SECSE=/absolute/path/to/SECSE
I'm using AutoDock Vina for docking:
(download here)
export VINA=/absolute/path/to/AutoDockVINA
I'm using AutoDock GPU: (adgpu-v1.5.3_linux_ocl_128wi)
(download here)
export AUTODOCK_GPU=/absolute/path/to/AutoDockGPU
I'm using Glide for docking (additional installation & license required):
export SCHRODINGER=/absolute/path/to/SCHRODINGER
I'm using Uni-Dock for docking (need GPU):
Compile from the Uni-Dock source code (recommended), or download here and add:
export UNIDOCK=/absolute/path/to/UNIDOCK
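Before a run, it can help to confirm that the variables above are set and point to existing paths. The following is a minimal sketch and not part of SECSE: the helper name `check_env` is hypothetical, and only the variable names (SECSE, VINA, AUTODOCK_GPU, SCHRODINGER, UNIDOCK) come from this README.

```python
import os


def check_env(required, environ=None):
    """Return a dict mapping each problematic variable to a short message.

    An empty dict means every required variable is set to an existing path.
    check_env is a hypothetical helper, not a SECSE function.
    """
    environ = os.environ if environ is None else environ
    problems = {}
    for name in required:
        value = environ.get(name)
        if not value:
            problems[name] = "not set"
        elif not os.path.exists(value):
            problems[name] = "path does not exist: " + value
    return problems


if __name__ == "__main__":
    # Example: a Vina-based setup needs SECSE and VINA.
    for name, problem in check_env(["SECSE", "VINA"]).items():
        print(name + ": " + problem)
```

Only SECSE and the variable for the docking program you actually use need to be passed to the check.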
Giving execution permissions to the SECSE directory
chmod -R +x /absolute/path/to/SECSE
Input fragments: a tab separated .smi file without header. See demo here.
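As a sanity check on the input format, the sketch below parses a headerless, tab-separated .smi stream. It assumes the first column is the SMILES string and the second is a fragment ID; the README does not state the column order, so check the demo file for the actual layout. The function name `read_fragments` and the two example fragments are illustrative only.

```python
import io


def read_fragments(handle):
    """Parse a headerless, tab-separated .smi stream into (smiles, frag_id) pairs.

    Assumption: column 1 is the SMILES string and column 2 is a fragment ID.
    """
    fragments = []
    for line_no, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError("line %d is not tab separated: %r" % (line_no, line))
        fragments.append((fields[0], fields[1]))
    return fragments


# Usage with an in-memory example instead of a real file.
example = io.StringIO("c1ccccc1\tfrag_1\nCCO\tfrag_2\n")
print(read_fragments(example))
```

A line separated by spaces instead of a tab raises a ValueError, which catches a common formatting mistake early.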
Parameters in config file:
[general]
project_code, project identifier, which is prefixed to each generated molecule ID, type=str
workdir, working directory, created if it does not exist, otherwise overwritten, type=str
fragments, file path to seed fragments, smi format, type=str
num_per_gen, number of molecules generated in each generation, type=int
seed_per_gen, number of seed molecules selected per generation, default=1000, type=int
start_gen, number of the starting generation; to resume a previous run, set start_gen to the number of the last completed generation in that run, default=0, type=int
num_gen, number of growing generations; the final generation number will be the sum of start_gen and num_gen, type=int
cpu, maximum number of CPUs to use, type=int
gpu, maximum number of GPUs to use for AutoDock GPU, type=int
rule_db, path to customized rules in json format, set to 0 to use the default rules, default=0
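The [general] parameters above can be sketched as a config section written with Python's standard configparser module. This is an illustration only: every value is a placeholder, the key names are assumed to be the underscore-separated forms (such as start_gen and num_gen), and a real config file also needs the other sections listed in this README.

```python
import configparser
import io

# All values are placeholders for illustration, not recommendations.
general = {
    "project_code": "demo",
    "workdir": "/absolute/path/to/workdir",
    "fragments": "/absolute/path/to/fragments.smi",
    "num_per_gen": "200",
    "seed_per_gen": "1000",  # documented default
    "start_gen": "0",        # documented default
    "num_gen": "3",
    "cpu": "8",
    "gpu": "1",
    "rule_db": "0",          # 0 selects the default rules
}

config = configparser.ConfigParser()
config["general"] = general

# Render the section as INI text.
buffer = io.StringIO()
config.write(buffer)
print(buffer.getvalue())

# Per the parameter descriptions, the final generation number is
# start_gen + num_gen.
final_gen = config.getint("general", "start_gen") + config.getint("general", "num_gen")
print("final generation:", final_gen)
```

When resuming, only start_gen changes: setting it to the last completed generation of the previous run shifts the final generation number by the same amount.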
[docking]
Parameters when docking by AutoDock Vina:
[prediction]
[properties]
Pattern, ID, and Max, where the ID should be unique for each SMARTS. You can refer to the example file subtructure_filter_demo.xls, default=0, type=string
Config file of a demo case: phgdh_demo_vina.ini
Customized rule json template: rules.json. Rule IDs should be in the form G-001-XXXX, like G-001-0001, G-001-0002, G-001-0003 ...
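The rule ID convention can be generated and checked with a few lines of Python. This sketch assumes XXXX is always four decimal digits, as in the examples above; the helper names `make_rule_id` and `is_valid_rule_id` are hypothetical and not part of SECSE.

```python
import re

# Documented form G-001-XXXX, assuming XXXX is four decimal digits.
RULE_ID_PATTERN = re.compile(r"G-001-[0-9]{4}")


def make_rule_id(index):
    """Format a rule ID such as G-001-0001 from an integer index (0..9999)."""
    if not 0 <= index <= 9999:
        raise ValueError("index must fit in four digits")
    return "G-001-%04d" % index


def is_valid_rule_id(rule_id):
    """Return True if rule_id matches the G-001-XXXX form exactly."""
    return RULE_ID_PATTERN.fullmatch(rule_id) is not None


print([make_rule_id(i) for i in (1, 2, 3)])
```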
Run SECSE
python $SECSE/run_secse.py --config /absolute/path/to/config
Please input the absolute path of the config file here.
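If the run is launched from a script, the config path can be made absolute before the command is built. This is a minimal sketch: `build_command` is a hypothetical helper, only run_secse.py and --config come from this README, and it assumes the current Python interpreter is the one from the secse conda environment.

```python
import os
import subprocess
import sys


def build_command(config_path, secse_home=None):
    """Build the documented command line with an absolute config path.

    build_command is a hypothetical helper, not part of SECSE.
    """
    secse_home = secse_home or os.environ["SECSE"]
    return [
        sys.executable,  # assumed to be the interpreter of the secse env
        os.path.join(secse_home, "run_secse.py"),
        "--config",
        os.path.abspath(config_path),
    ]


if __name__ == "__main__":
    # Launch only when a config path is given on the command line.
    if len(sys.argv) > 1:
        subprocess.run(build_command(sys.argv[1]), check=True)
```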
Output files
GNU Parallel installation
On CentOS / RHEL:
sudo yum install parallel
On Ubuntu / Debian:
sudo apt-get install parallel
python ~=3.9, perl ~=5.32
numpy~=1.24.3, pandas~=1.3.3, xlrd~=2.0.1, pandarallel~=1.5.2, tqdm~=4.65.0, biopandas~=0.4.1, openbabel~=3.1.1, rdkit~=2022.09, chemprop~=1.5.2, pytorch~=2.0.0+cu117
A CPU-only Linux server also works.
Lu, C.; Liu, S.; Shi, W.; Yu, J.; Zhou, Z.; Zhang, X.; Lu, X.; Cai, F.; Xia, N.; Wang, Y. Systemic Evolutionary Chemical Space Exploration For Drug Discovery. J Cheminform 14, 19 (2022).
https://doi.org/10.1186/s13321-022-00598-4
SECSE is released under Apache License, Version 2.0.
The project is under active development. If you have any questions or suggestions, please contact: lu_chong@keenthera.com