SMRT-SV is currently designed to distribute over an SGE cluster using DRMAA (https://pypi.org/project/drmaa/). smrtsv.py passes several parameters through to Snakemake. If you can distribute a Snakemake pipeline through DRMAA, then you can set the same parameters to distribute SMRT-SV.
Key parameters:
- --drmaalib: If DRMAA_LIBRARY_PATH is not set in the environment, then this option is required. It is the full path to libdrmaa.so.1.0.
- --jobs: The number of concurrent jobs to submit. The run command ignores this parameter (it takes --runjobs instead).
Note that all of these parameters should appear on the command line before the SMRT-SV command to be run. The only exception is --runjobs, which must appear after run (see below).
For example:
SMRTSV_DIR=/path/to/smrtsv2
${SMRTSV_DIR}/smrtsv.py \
--cluster-config ${SMRTSV_DIR}/cluster.eichler.json \
--drmaalib /usr/lib/libdrmaa.so.1.0 \
--distribute \
run \
--runjobs "25,20,120,10" \
/path/to/reference.fasta \
/path/to/reads.fofn
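If DRMAA_LIBRARY_PATH is already set in the environment, --drmaalib should not be needed. A minimal sketch of that alternative, reusing the same library path as the example above:
export DRMAA_LIBRARY_PATH=/usr/lib/libdrmaa.so.1.0
# --drmaalib can then be dropped from the command above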
Most of these options are passed directly to Snakemake, so you can look at the Snakemake arguments for more information. If it's useful, smrtsvlib/args.py has the options and their default values, and run_snake_target() in smrtsvlib/smrtsvrunner.py builds the command-line options and feeds them to Snakemake.
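To see the full list of options and their defaults without digging through the source, the built-in help should also work (assuming the standard argparse help defined in args.py):
${SMRTSV_DIR}/smrtsv.py --help
${SMRTSV_DIR}/smrtsv.py run --help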
Thanks for your reply, I really appreciate it! I have now run the pipeline on the SGE cluster, but it only submits one job at a time automatically. Can I submit jobs on SGE concurrently? Thank you!
This is what --jobs and --runjobs do: they tell the pipeline how many concurrent jobs to submit. If you set --jobs 100 and there are 1,000 jobs ready to run, it will submit 100 of them. When one finishes, the next job in the available pool is run.
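For the run command, the concurrency limits are the values given to --runjobs. A sketch (not tested here) that raises those limits relative to the earlier example; I am assuming the four comma-separated values are per-stage job limits, so adjust them to what your cluster can handle:
${SMRTSV_DIR}/smrtsv.py \
--cluster-config ${SMRTSV_DIR}/cluster.eichler.json \
--drmaalib /usr/lib/libdrmaa.so.1.0 \
--distribute \
run \
--runjobs "100,100,200,100" \
/path/to/reference.fasta \
/path/to/reads.fofn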
Yes, it is! Thank you!
I ran into a problem during the assembly step. It gives the error "variantCaller: command not found", but I cannot find that tool. Would you please give me some guidance on this?
variantCaller is the program that runs arrow to polish assemblies. It is part of PacBio's "GenomicConsensus" package. If you don't have it installed somewhere, I would install it through BioConda and add the directory containing the installed executable to PATH so SMRT-SV can find it (https://github.com/PacificBiosciences/pbbioconda).
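A sketch of the BioConda route (the package name "genomicconsensus" and the environment path are assumptions on my part; adjust to your conda setup):
# create an environment with GenomicConsensus, which provides variantCaller
conda create -n genomicconsensus -c bioconda -c conda-forge genomicconsensus
# put that environment's bin directory on PATH before launching SMRT-SV
export PATH=/path/to/conda/envs/genomicconsensus/bin:$PATH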
Thank you! But I am sorry to say I cannot find the "alignfixup" tool. The log file tells me "/usr/bin/bash: alignfixup: command not found".
alignfixup was part of the dist directory, and I need to pull it back in.
I am about to redo the build system so that SMRT-SV can set up all its dependencies before building, so this should be less painful within a few weeks.
I will let you know when I have an update.
Thank you! Could you give me some suggestions for working with Illumina data, such as which software gives more accurate results?
I just pushed an update with a build system for dependencies. If you cd into "dep" and run "make", it should build everything, including "alignfixup", and place it in "dep/bin". SMRT-SV will search "dep/bin" before any other locations, so if you are using your own installed tools, SMRT-SV will start using the tools installed in "dep" instead (e.g. snakemake, canu, blasr, samtools, etc). I just added this and have not tested it yet.
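Based on that description, the steps would look like this (an untested sketch; the SMRTSV_DIR variable follows the earlier example):
cd ${SMRTSV_DIR}/dep
make
# built tools, including alignfixup, should end up here:
ls ${SMRTSV_DIR}/dep/bin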
For short reads, the most current resource I know of is the HGSVC paper: https://www.biorxiv.org/content/early/2018/06/13/193144
Several short-read callers were run and compared in that study. I know there are other papers, but these are not tools I often run.
Let me know if it gets past alignfixup or if you have other problems.
I was running smrtsv2 and found that it runs jobs one by one with Snakemake on SGE, which is too slow. Can anyone tell me whether something is wrong?