SMRT-SV is currently designed to distribute over an SGE cluster using DRMAA (https://pypi.org/project/drmaa/). smrtsv.py passes several parameters through to Snakemake. If you can distribute a Snakemake pipeline through DRMAA, then you can set the same parameters to distribute SMRT-SV.
Key parameters:
- --drmaalib: If DRMAA_LIBRARY_PATH is not set in the environment, then this option is required. It is the full path to libdrmaa.so.1.0.
- --jobs: The number of concurrent jobs to submit. The run command ignores this parameter (it takes --runjobs instead).
Note that all of these parameters should appear on the command line before the SMRT-SV command to be run. The only exception is --runjobs, which must appear after run (see below).
For example:
SMRTSV_DIR=/path/to/smrtsv2
${SMRTSV_DIR}/smrtsv.py \
--cluster-config ${SMRTSV_DIR}/cluster.eichler.json \
--drmaalib /usr/lib/libdrmaa.so.1.0 \
--distribute \
run \
--runjobs "25,20,120,10" \
/path/to/reference.fasta \
/path/to/reads.fofn
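If DRMAA_LIBRARY_PATH is already set in the environment, --drmaalib should not be needed. A minimal sketch of that alternative, reusing the same library path as the example above:
export DRMAA_LIBRARY_PATH=/usr/lib/libdrmaa.so.1.0
# --drmaalib can then be dropped from the command above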
Most of these options are passed directly to Snakemake, so you can look at the Snakemake arguments for more information. If it's useful, smrtsvlib/args.py has the options and their default values, and run_snake_target() in smrtsvlib/smrtsvrunner.py builds the command-line options and feeds them to Snakemake.
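To see the full list of options and their defaults without digging through the source, the built-in help should also work (assuming the standard argparse help defined in args.py):
${SMRTSV_DIR}/smrtsv.py --help
${SMRTSV_DIR}/smrtsv.py run --help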
Thanks for your reply, I really appreciate it! I have now run the pipeline on the SGE cluster, but it only submits one job at a time automatically. Can I submit jobs on SGE concurrently? Thank you!
This is what --jobs and --runjobs do: they tell the pipeline how many concurrent jobs to submit. If you set --jobs 100 and there are 1,000 jobs ready to run, it will submit 100 of them. When one finishes, the next job in the available pool is run.
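For the run command, the concurrency limits are the values given to --runjobs. A sketch (not tested here) that raises those limits relative to the earlier example; I am assuming the four comma-separated values are per-stage job limits, so adjust them to what your cluster can handle:
${SMRTSV_DIR}/smrtsv.py \
--cluster-config ${SMRTSV_DIR}/cluster.eichler.json \
--drmaalib /usr/lib/libdrmaa.so.1.0 \
--distribute \
run \
--runjobs "100,100,200,100" \
/path/to/reference.fasta \
/path/to/reads.fofn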
Yes, it is! Thank you!
I ran into a problem during the assembly step. It gives the error "variantCaller: command not found", but I cannot find that tool. Would you please give me some guidance on this?
variantCaller is the program that runs arrow to polish assemblies. It is part of PacBio's "GenomicConsensus" package. If you don't have it installed somewhere, I would install it through BioConda and add the directory containing the installed executable to PATH so SMRT-SV can find it (https://github.com/PacificBiosciences/pbbioconda).
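A sketch of the BioConda route (the package name "genomicconsensus" and the environment path are assumptions on my part; adjust to your conda setup):
# create an environment with GenomicConsensus, which provides variantCaller
conda create -n genomicconsensus -c bioconda -c conda-forge genomicconsensus
# put that environment's bin directory on PATH before launching SMRT-SV
export PATH=/path/to/conda/envs/genomicconsensus/bin:$PATH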
Thank you! But I am sorry to say I cannot find the "alignfixup" tool. The log file tells me "/usr/bin/bash: alignfixup: command not found".
alignfixup was part of the dist directory, and I need to pull it back in.
I am about to redo the build system so that SMRT-SV can set up all its dependencies before building, so this should be less painful within a few weeks.
I will let you know when I have an update.
Thank you! Could you give me some suggestions for working with Illumina data, such as which software gives more accurate results?
I just pushed an update with a build system for dependencies. If you cd into "dep" and run "make", it should build everything, including "alignfixup", and place it in "dep/bin". SMRT-SV will search "dep/bin" before any other locations, so if you are using your own installed tools, SMRT-SV will start using the tools installed in "dep" instead (e.g. snakemake, canu, blasr, samtools, etc). I just added this and have not tested it yet.
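Based on that description, the steps would look like this (an untested sketch; the SMRTSV_DIR variable follows the earlier example):
cd ${SMRTSV_DIR}/dep
make
# built tools, including alignfixup, should end up here:
ls ${SMRTSV_DIR}/dep/bin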
For short reads, the most current resource I know of is the HGSVC paper: https://www.biorxiv.org/content/early/2018/06/13/193144
Several short-read callers were run and compared in that study. I know there are other papers, but these are not tools I often run.
Let me know if it gets past alignfixup or if you have other problems.
I was running smrtsv2 and found that it runs jobs one by one with Snakemake on SGE, which is too slow. Can anyone tell me whether something is wrong?