Closed by maurerv 3 months ago
@dingquanyu from your PR at https://github.com/KosinskiLab/AlphaPulldownSnakemake/pull/13 it seemed like you wanted run_multimer_jobs.py to use a single output directory and create subdirectories for each fold according to use_ap_style.
We could extend run_multimer_jobs.py to allow multiple output_paths. However, run_multimer_jobs.py uses the file-based fold specification, where the user might not know the number of folds beforehand, so I think a single output directory makes the most sense.
I see. Does this mean that in the snakemake pipeline you will bypass run_multimer_jobs.py and launch run_structure_prediction.py directly with a cluster of jobs?
Exactly. I added a checkpoint that performs the clustering and then extended the current rule using run_structure_prediction.py to run on each cluster separately. This way we don't need additional rules.
I just pushed these changes for reference: bfa71c7ac5d013a0c1aea3b78fc347381a3ca06c
I see. Thanks for the commit. Now it makes sense to me.
I guess in the case of padding, you may also need to update the --output_directory key in the argument dictionary so that its value is a list covering all the sub-folders that should be created in the if block here? For example, iterate through all_folds and append each path.join(FLAGS.output_path, <name of the protein complex>) to a list.
https://github.com/KosinskiLab/AlphaPulldown/blob/732baec9a47d3e02658975078ac29a8fbab66a68/alphapulldown/scripts/run_multimer_jobs.py#L125
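To make the suggestion concrete, here is a rough sketch of what I mean. It is not the actual code in run_multimer_jobs.py: the helper name build_output_directories, the example fold names, the /tmp/predictions base path, and the constant_args dictionary name are all made up for illustration, and I am assuming all_folds can be reduced to one name per protein complex.

```python
from os import path


def build_output_directories(output_path, all_folds):
    """Return one sub-folder per fold, all under a single base directory.

    Hypothetical helper: `all_folds` is assumed to be a list of strings,
    each naming one protein complex.
    """
    return [path.join(output_path, fold_name) for fold_name in all_folds]


# Illustrative values only; in the script the base path would come from
# the output_path flag and the folds from the parsed fold specification.
all_folds = ["proteinA_and_proteinB", "proteinA_and_proteinC"]
output_directories = build_output_directories("/tmp/predictions", all_folds)

# The argument dictionary (name assumed) then carries a list rather than
# a single string under the --output_directory key.
constant_args = {"--output_directory": output_directories}
```

With this, the padding case would get one directory per complex while the user still only specifies a single output path.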