Closed rvhonorato closed 2 years ago
I ended up changing the environment.yml to:
name: pyscreener_env
channels:
  - conda-forge
  - defaults
dependencies:
  - pip
  - python=3.8
  - openbabel
  - openmm
  - rdkit
  - pip:
    - colorama
    - configargparse
    - git+https://github.com/openmm/pdbfixer.git
    - h5py
    - numpy
    - ray
    - pandas
    - pytest
    - scikit_learn
    - scipy
    - seaborn
    - tqdm
and then:
$ conda env create -f environment.yml
$ conda activate pyscreener_env
$ pip install .
Seems to have done the trick:
$ pyscreener -h
usage: pyscreener [-h] [--config CONFIG] [--version] [--smoke-test] [-o OUTPUT_DIR] [--no-sort] [--collect-all] [-v] [--preprocessing-options {pdbfix,filter} [{pdbfix,filter} ...]] [--pH PH] [-s SMIS [SMIS ...]] [-i INPUT_FILES [INPUT_FILES ...]] [--input-filetypes INPUT_FILETYPES [INPUT_FILETYPES ...]]
[--no-title-line] [--smiles-col SMILES_COL] [--name-col NAME_COL] [--id-property ID_PROPERTY] [--use-3d] [--optimize] --screen-type {dock,dock6,ucsfdock,vina,qvina,smina,psovina} [--receptors RECEPTORS [RECEPTORS ...]] [--center CENTER_X CENTER_Y CENTER_Z] [--size SIZE_X SIZE_Y SIZE_Z]
[--metadata-template METADATA_TEMPLATE] [--pdbids PDBIDS [PDBIDS ...]] [--docked-ligand-file DOCKED_LIGAND_FILE] [--buffer BUFFER] [-nc NCPU] [--base-name BASE_NAME] [--score-mode {best,avg,boltzmann,top-k}] [--repeat-score-mode {best,avg,boltzmann,top-k}]
[--ensemble-score-mode {best,avg,boltzmann,top-k}] [--repeats REPEATS] [-k K] [--postprocessing-options {visualize} [{visualize} ...]] [--hist-mode {image,text}]
Automate virtual screening of compound libraries.
optional arguments:
-h, --help show this help message and exit
--config CONFIG filepath of a configuration file to use
--version show program's version number and exit
--smoke-test whether to perform a smoke test by checking if the environment is set up properly
-o OUTPUT_DIR, --output-dir OUTPUT_DIR
the path of the output directory
--no-sort do not sort the output scores CSV file by score
--collect-all whether all prepared input files and generated output files should be collected to the final output directory. By default, these files are all stored in a node-local temporary directory that is inaccessible after program completion.
-v, --verbose the level of output this program should print
--preprocessing-options {pdbfix,filter} [{pdbfix,filter} ...]
the preprocessing options to apply
--pH PH the pH for which to calculate protonation state for protein and ligand residues
-s SMIS [SMIS ...], --smis SMIS [SMIS ...]
the SMILES strings of the ligands to dock
-i INPUT_FILES [INPUT_FILES ...], --input-files INPUT_FILES [INPUT_FILES ...]
the filenames containing ligands to dock
--input-filetypes INPUT_FILETYPES [INPUT_FILETYPES ...]
the filetype of each input ligand. If unspecified, will attempt to determine the filetype for each file.
--no-title-line whether there is no title line in the ligands CSV file
--smiles-col SMILES_COL
the column containing the SMILES strings in the CSV file.
--name-col NAME_COL UNUSED the column containing the molecule names/IDs in the CSV file. Molecules will be labeled as ligand_<i> otherwise.
--id-property ID_PROPERTY
UNUSED the name of the property containing the molecule names/IDs in a SMI or SDF file (e.g., "CatalogID", "Chemspace_ID", "Name", etc.). Molecules will be labeled as ligand_<i> otherwise.
--use-3d whether to use the input 3D geometry of each molecule. Note that, in principle, initial geometry of input molecules to flexible docking simulations is statisically insignificant. This option is useful for presevering tautomeric information about input molecules.
--optimize whether the geometry of each molecule should be optimized using the RDKit MMFF94 forcefield first. Note that, in principle, initial geometry of input molecules to flexible docking simulations is statisically insignificant.
--screen-type {dock,dock6,ucsfdock,vina,qvina,smina,psovina}
the type of docking screen to perform
--receptors RECEPTORS [RECEPTORS ...]
the filenames of the receptors
--center CENTER_X CENTER_Y CENTER_Z
the x-, y-, and z-coordinates of the center of the docking box
--size SIZE_X SIZE_Y SIZE_Z
the x-, y-, and z-radii of the docking box
--metadata-template METADATA_TEMPLATE
--pdbids PDBIDS [PDBIDS ...]
the PDB IDs of the crystal structures to dock against
--docked-ligand-file DOCKED_LIGAND_FILE
the filepath of a PDB file containing the docked pose of a ligand from which to automatically construct a docking box
--buffer BUFFER the amount of buffer space to add around the docked ligand when calculating the docking box
-nc NCPU, --ncpu NCPU
--base-name BASE_NAME
--score-mode {best,avg,boltzmann,top-k}
The method used to calculate the score of a single docking run on a single receptor
--repeat-score-mode {best,avg,boltzmann,top-k}
The method used to calculate the overall score from repeated docking runs
--ensemble-score-mode {best,avg,boltzmann,top-k}
The method used to calculate the overall score from an ensemble of docking runs
--repeats REPEATS the number of times to repeat each docking run
-k K the number of top scores to average if using a top-k score mode
--postprocessing-options {visualize} [{visualize} ...]
the postprocessing options to apply
--hist-mode {image,text}
the type of histogram to generate. "image" makes a histogram that is output as a PNG file and "text" generates a histogram using terminal output.
Args that start with '--' (eg. --version) can also be set in a config file (specified via --config). Config file syntax allows: key=value, flag=true, stuff=[a,b,c] (for details, see syntax at https://goo.gl/R74nmi). If an arg is specified in more than one place, then commandline values override config file
values which override defaults
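The precedence described at the end of the help text (command line over config file over defaults) can be illustrated with a small standard-library sketch. This is not how pyscreener or configargparse implements it; the option values below are hypothetical and chosen only to show the lookup order.

```python
from collections import ChainMap

# Hypothetical values for illustration only; these are not pyscreener's real defaults.
defaults = {"ncpu": 1, "score_mode": "best", "repeats": 1}
config_file = {"ncpu": 4, "score_mode": "avg"}  # as if read from the --config file
command_line = {"ncpu": 8}                      # as if passed as -nc 8

# ChainMap searches its maps from left to right, so earlier maps take priority:
# command line first, then the config file, then the defaults.
settings = ChainMap(command_line, config_file, defaults)

resolved = dict(settings)
print(resolved)
```

Here ncpu comes from the command line, score_mode from the config file, and repeats falls through to the default.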
Shouldn't it be simply pip install pyscreener to get it from pip, or python setup.py install to build it from the cloned repository?
I too found that
$ pyscreener-check SCREEN_TYPE METADATA_TEMPLATE
was a bit confusing given where it appears in the documentation, since at that point I wasn't sure what a SCREEN_TYPE or METADATA_TEMPLATE was, but that is explained later.
It may be better to direct users to test their setup with something like
$ pyscreener --config integration-tests/configs/test_vina.ini --smoke-test
Checking environment and metadata for "vina" screen
Checking PATH and environment variables ... PASS
Validating metadata ... PASS
Environment is properly set up!
That seems to check everything, and --smoke-test is a pretty good description.
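If the smoke test ends up in CI, its output could be checked programmatically. The following is a minimal sketch that assumes the line format shown in the sample output above; the "FAIL" status is an assumption on my part, since only "PASS" appears in that output, and parse_smoke_test is a hypothetical helper, not part of pyscreener.

```python
import re

# Matches lines such as "Validating metadata ... PASS".
# "FAIL" is an assumed counterpart to "PASS"; it does not appear in the sample output.
CHECK_LINE = re.compile(r"^(?P<check>.+?)\s*\.\.\.\s*(?P<status>PASS|FAIL)\s*$")


def parse_smoke_test(output: str) -> dict:
    """Map each reported check to True (PASS) or False (FAIL)."""
    results = {}
    for line in output.splitlines():
        match = CHECK_LINE.match(line.strip())
        if match:
            results[match.group("check")] = match.group("status") == "PASS"
    return results


# Sample copied from the smoke-test output quoted in this thread.
sample = """Checking environment and metadata for "vina" screen
Checking PATH and environment variables ... PASS
Validating metadata ... PASS
Environment is properly set up!"""

results = parse_smoke_test(sample)
# Require at least one recognized check so that empty output does not count as success.
all_passed = bool(results) and all(results.values())
print(results, all_passed)
```

In practice the text would come from running the pyscreener command shown above and capturing its standard output.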
@mikemhenry good suggestion re: --smoke-test. I'll move towards that approach!
@rvhonorato I would love to pip install pyscreener, but I was running into issues getting pdbfixer installed. I was unaware of the conda distribution, so I was requiring users to build it from source. Regardless, to my knowledge, you can't include a git dependency in a PyPI package. I'm open to feedback on how to make the setup process more streamlined. That's been a consistent critique, but I just can't think of good approaches to handle all the conflicting requirements/dependencies. Please let me know what you think would be better (I'll defer to both of your opinions here!)
hi @rvhonorato and @mikemhenry,
this issue has been addressed in the recent commits to main: c7494f9fb44d0e13d6a8f53a32aa8550cad46ae3..fb4b0e090468937e6bf7f8274b0e173b43fd0de1
I'm closing this for now, but feel free to reopen if it's not sufficient!
Companion of openjournals/joss-reviews/issues/3950
The install instructions are not very clear to me, but after following it step-by-step I tried to check it with:
Which seems like a nice feature, but it's not clear what SCREEN_TYPE and METADATA_TEMPLATE are, and also there is no usage:
Could you please clarify this step?
An additional note is that to install the packages you first need to add the conda-forge channel with
$ conda config --append channels conda-forge
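After creating the environment, one quick way to see whether every dependency actually resolved is to check that each package is importable. This is only a sketch: the import names below are my assumptions for the packages listed in the environment.yml earlier in this thread, and missing_modules is a hypothetical helper, not something pyscreener provides.

```python
from importlib.util import find_spec

# Import names assumed for the packages in the environment.yml from this thread.
# Some differ from the package name (scikit_learn imports as sklearn), and
# older OpenMM releases import as simtk.openmm rather than openmm.
EXPECTED_MODULES = [
    "openbabel", "openmm", "rdkit", "colorama", "configargparse", "pdbfixer",
    "h5py", "numpy", "ray", "pandas", "pytest", "sklearn", "scipy",
    "seaborn", "tqdm",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

Run inside the activated pyscreener_env, an empty result suggests the conda and pip sections both installed; it does not check versions or the external docking programs.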