Ferryistaken opened this issue 1 month ago
Great question @Ferryistaken. We have been thinking about ways to make metl-sim more accessible to others. Our initial ideas were around OSG, but running HTCondor locally is another approach.
If you are running HTCondor locally, there are a lot of code and data preparation steps you can skip. You do not need to upload anything to OSDF/Squid; everything can stay on your local machine.
Regarding your last question: `code/energize.py` specifies the path to Rosetta:
```
$ python code/energize.py -h
usage: energize.py [-h] [--rosetta_main_dir ROSETTA_MAIN_DIR]
                   [--variants_fn VARIANTS_FN] [--chain CHAIN]
                   [--pdb_dir PDB_DIR]
                   [--allowable_failure_fraction ALLOWABLE_FAILURE_FRACTION]
                   [--mutate_default_max_cycles MUTATE_DEFAULT_MAX_CYCLES]
                   [--relax_repeats RELAX_REPEATS]
                   [--relax_nstruct RELAX_NSTRUCT]
                   [--relax_distance RELAX_DISTANCE] [--save_wd]
                   [--log_dir_base LOG_DIR_BASE] [--cluster CLUSTER]
                   [--process PROCESS] [--commit_id COMMIT_ID]

this is the run script that executes on the server

optional arguments:
  -h, --help            show this help message and exit
  --rosetta_main_dir ROSETTA_MAIN_DIR
                        path to the main directory of the rosetta distribution
  --variants_fn VARIANTS_FN
                        path to text file containing protein variants
  --chain CHAIN         the chain to use from the pdb file
  --pdb_dir PDB_DIR     directory containing the pdb files referenced in
                        variants_fn
  --allowable_failure_fraction ALLOWABLE_FAILURE_FRACTION
                        fraction of variants that can fail but still consider
                        this job successful
  --mutate_default_max_cycles MUTATE_DEFAULT_MAX_CYCLES
                        number of optimization cycles in the mutate step
  --relax_repeats RELAX_REPEATS
                        number of FastRelax repeats in the relax step
  --relax_nstruct RELAX_NSTRUCT
                        number of structures (restarts) in the relax step
  --relax_distance RELAX_DISTANCE
                        distance threshold in angstroms for the residue
                        selector in the relax step
  --save_wd             set this flag to save the full working directory for
                        each variant
  --log_dir_base LOG_DIR_BASE
                        base output directory where log dirs for each run will
                        be placed
  --cluster CLUSTER     cluster (when running on HTCondor)
  --process PROCESS     process (when running on HTCondor)
  --commit_id COMMIT_ID
                        the github commit id corresponding to this version of
                        the code
```
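For a purely local run, you could invoke the script directly with local paths. Here is a hypothetical example of building such a command (the paths are placeholders for your machine; the flag names come from the help output above):

```python
import shlex

# Hypothetical local paths -- replace these with the locations on your machine.
# The flag names are taken from the energize.py help output above.
cmd = [
    "python", "code/energize.py",
    "--rosetta_main_dir", "/opt/rosetta/main",  # local Rosetta distribution
    "--variants_fn", "variants.txt",            # text file of protein variants
    "--pdb_dir", "pdb_files",                   # PDBs referenced in variants_fn
    "--chain", "A",
]

# Print the command that would be launched; swap print for subprocess.run(cmd)
# to actually execute it.
print(shlex.join(cmd))
```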
@samgelman what other steps can @Ferryistaken skip in the setup?
To add to what @agitter said, you should be able to use a local HTCondor with modifications to the existing framework.
The key files are the HTCondor submit file and the run script, `energize.sub` and `run.sh`. `+WantFlocking = true` and `+WantGlideIn = true`, which are currently set in the submit file, won't be needed for a local run. The submit file also contains some keywords like `{osdf_rosetta_distribution}`, `{osdf_python_distribution}`, and `{transfer_input_files}`. These keywords get replaced with actual values by the `condor.py` script; they specify the paths, on the execute nodes, to the packaged Rosetta and Python distributions and any additional input files. You may be able to leave these as-is and just specify local paths instead of OSDF paths in `osdf_rosetta_distribution.txt` and `osdf_python_distribution.txt`. The run script `run.sh` calls `energize.py` with the appropriate arguments; depending on how you package and transfer Python and Rosetta, it will need to be modified. Hopefully this is enough to get started, and if you encounter additional questions, I would be happy to help.
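The keyword substitution itself is just a template fill. A minimal sketch of that pattern (the template text and the values here are hypothetical, not copied from the actual submit file):

```python
# Hypothetical submit-file template using the keywords mentioned above.
template = """\
transfer_input_files = {transfer_input_files}
rosetta = {osdf_rosetta_distribution}
python = {osdf_python_distribution}
"""

# For a local run, these could simply be local paths instead of OSDF URLs.
values = {
    "transfer_input_files": "variants.txt, pdb_files/",
    "osdf_rosetta_distribution": "/opt/rosetta.tar.gz",
    "osdf_python_distribution": "/opt/python.tar.gz",
}

# Replace every {keyword} with its value, as condor.py does for the real file.
filled = template.format(**values)
print(filled)
```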
I wanted to add that if you plan to have a local install of Python and Rosetta on each execute node, then you won't need to transfer those from the submit node. You would need to modify the files I listed above, especially `run.sh`, to assume that Python/Rosetta are already installed on the execute nodes rather than needing to be packaged and set up.
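One way to express that "prefer a local install, otherwise unpack the transferred archive" logic, sketched in Python rather than in `run.sh` itself (the function and path names are hypothetical, not part of the repo):

```python
import os
import tarfile

def resolve_rosetta(preinstalled_dir, archive_path=None, unpack_dir="rosetta"):
    """Return a usable Rosetta main dir: prefer a node-local install,
    otherwise unpack an archive transferred from the submit node."""
    if preinstalled_dir and os.path.isdir(preinstalled_dir):
        return preinstalled_dir                 # local install already present
    if archive_path and os.path.isfile(archive_path):
        with tarfile.open(archive_path) as tf:  # fallback: unpack the transfer
            tf.extractall(unpack_dir)
        return os.path.join(unpack_dir, "main")
    raise FileNotFoundError("no Rosetta install or archive available")
```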
Yes, my current architecture involves having the Python and Rosetta installs on each execute node, with modifications to `run.sh` and `energize.sub` to account for that. I might work on making a script and opening a PR if you would be interested.
@Ferryistaken I would be interested in having you contribute your solution back to this repo if you get everything working. We would need to decide the best way to organize that based on how many files you modified and how much they changed.
Hello,

I've run the pipeline without HTCondor up until the processing results part (which I assume is not currently possible without running the pipeline in HTCondor, unless I write a custom script that takes the non-HTCondor `energize_output` and packages it into a database understandable by `metl`).

From my understanding, it's unfeasible to generate a good enough training set without parallelizing the computation of Rosetta's energy parameters for all variants. I've set up my own HTCondor instance to which I'm able to connect a few execute nodes, and I would like to run `metl-sim` on this cluster. The part that I don't understand is: do I really need to upload `rosetta` and `python` to OSDF/Squid if I'm running the algorithm only on my own machines? Or is there another way (such as adding the Rosetta and Python environments to all execute nodes through my `docker-compose`)? I might be wrong, but it seems like I would only need to upload to Squid if I'm connecting to a highly distributed HTCondor cluster that I don't have admin privileges on, right?

Where in the scripts are the OSDF Python/Rosetta environments being accessed? Is there a workaround to skip that step and instead use a local install?
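On the results-processing gap mentioned above: without knowing the exact layout of `energize_output`, a custom aggregation script might simply walk the per-run output directories and collect rows into one file. A rough sketch under the assumption that each run directory holds an `energies.csv` (the file and column names are guesses, not the actual metl-sim format):

```python
import csv
import glob
import os

def aggregate(output_root, combined_fn="combined_energies.csv"):
    """Collect per-run energies.csv files under output_root into one CSV.

    Assumes every run directory contains an energies.csv with a shared
    header row; returns the number of data rows collected.
    """
    rows, header = [], None
    for fn in sorted(glob.glob(os.path.join(output_root, "*", "energies.csv"))):
        with open(fn, newline="") as f:
            reader = csv.reader(f)
            this_header = next(reader)     # skip each file's header row
            if header is None:
                header = this_header       # keep the first header we see
            rows.extend(reader)
    with open(combined_fn, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header or [])
        writer.writerows(rows)
    return len(rows)
```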