EichlerLab / smrtsv2

Structural variant caller
MIT License

'Wildcards' object has no attribute 'cpu' Failed to index reference #35

Closed rozaimirazali closed 5 years ago

rozaimirazali commented 5 years ago

I am getting a WorkflowError message when the software tries to index the reference.

Error message:

localrules directive specifies rules that are not present in the Snakefile: ref_run

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cluster nodes: 1
Job counts:
    count  jobs
    1      ref_all
    1      ref_make_ctab
    1      ref_make_fai
    1      ref_make_sa
    1      ref_make_sizes
    1      ref_set_fa
    6

[Tue Apr 30 09:10:05 2019]
localrule ref_set_fa:
    output: reference/ref.fasta
    jobid: 3

[Tue Apr 30 09:10:05 2019]
Finished job 3.
1 of 6 steps (17%) done

[Tue Apr 30 09:10:05 2019]
rule ref_make_ctab:
    input: reference/ref.fasta
    output: reference/ref.fasta.ctab
    jobid: 1

WorkflowError in line 58 of /gpfs/software/genomics/smrtsv2/smrtsv2/rules/reference.snakefile:
'Wildcards' object has no attribute 'cpu'
Failed to index reference
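For context on this kind of error: Snakemake resolves placeholders like {cluster.cpu} by attribute lookup on a namespace built from the cluster configuration entry, so the placeholder name and the config key must match exactly; a mismatch (renamed keys, or no cluster config supplied at all) makes the lookup fail. A toy sketch of the failure mode, not SMRT-SV's or Snakemake's actual code:

```python
# Toy illustration only (not SMRT-SV's or Snakemake's implementation).
# A placeholder such as {cluster.cpu} is filled by attribute lookup on an
# object built from the cluster config entry; if that object has no matching
# attribute, the lookup raises AttributeError, analogous to
# "'Wildcards' object has no attribute 'cpu'" in the log above.
from types import SimpleNamespace

# Hypothetical config entry whose keys were renamed to nCPUs/memory.
cluster_entry = SimpleNamespace(nCPUs=32, memory="4G")

template = "-pe serial {cluster.cpu} -l mfree={cluster.mem}"
try:
    template.format(cluster=cluster_entry)
except AttributeError as err:
    # The lookup for the missing 'cpu' attribute fails here.
    print("placeholder lookup failed:", err)

# With matching names, substitution succeeds:
print("-pe serial {cluster.nCPUs} -l mfree={cluster.memory}".format(cluster=cluster_entry))
```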

I am using an LSF scheduler, and following the Snakemake cluster-configuration guide (https://snakemake.readthedocs.io/en/stable/snakefiles/configuration.html#cluster-configuration), my cluster.json file uses 'nCPUs' instead of 'cpu', for example.

If I change nCPUs to cpu in my cluster.json file, I get a drmaa invalid-attribute error message instead:

drmaa.errors.InvalidAttributeFormatException: code 13: Error in native specification:
-V -cwd -j y -o ./log.txt -pe serial 32 -l mfree=4G -l h_rt=01:00:00 -l gpfsstate=0 -w n -S /bin/bash
Failed to index reference

rozaimirazali commented 5 years ago

I edited the args.py file in smrtsvlib:

args_dict['cluster_params'] = {
    'default':
        ' -V -cwd -j y -o ./{log} '
        '-pe serial {{cluster.nCPUs}} '
        '-l mfree={{cluster.memory}} '
        '-l gpfsstate=0 '
        '-w n -S /bin/bash',
    'help':
        'Cluster scheduling parameters with place-holders as {{cluster.XXX}} for '
        'parameters in the cluster configuration file (--cluster-config) and {log} '
        'for the log directory where standard output from cluster jobs is written.'
}

That is, I changed cluster.cpu to cluster.nCPUs and cluster.mem to cluster.memory.

Accordingly, I also updated the cluster.json file so that it uses 'nCPUs' instead of 'cpu' and 'memory' instead of 'mem'.

However, I still get the same error message.

paudano commented 5 years ago

There are several things going on here.

First, I would pull a new copy of SMRT-SV and rebuild the dependencies (cd dep; make). Your copy must be fairly old if it's still trying to build the ctab reference index; that step was removed a while ago. It's not causing this problem, but other issues have since been fixed.

Do not change the parameter keywords "mem", "cpu", "rt", and "params". Values are parsed into those placeholders before anything reaches LSF, so that's not the problem.

I think you are missing the --cluster-config argument (see below), so there is nothing to parse into those placeholders. The cluster configuration file we use is cluster.eichler.json (in the SMRT-SV directory), and you may need to tune that for your environment. When you take the "cluster_params" string and parse values from the cluster config, you should end up with a valid job parameter string for LSF.
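For reference, a cluster config in the shape described above might look like the following. This is an illustrative sketch only, with made-up values and the standard Snakemake "__default__" entry; the real keys and any per-rule overrides should be taken from cluster.eichler.json in the SMRT-SV directory:

```json
{
    "__default__": {
        "cpu": 8,
        "mem": "4G",
        "rt": "01:00:00",
        "params": ""
    }
}
```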

Here is an example:

SMRTSV_DIR=/path/to/smrtsv
REF_FA=/path/to/reference.fasta
FOFN_FILE=/path/to/input_reads.fofn
DRMAA_LIB=/path/to/libdrmaa.so

${SMRTSV_DIR}/smrtsv \
    --cluster-config ${SMRTSV_DIR}/cluster.eichler.json \
    --drmaalib ${DRMAA_LIB} \
    --distribute \
    run \
    --batches 20 \
    --runjobs "25,20,200,10" \
    --threads 8 \
    ${REF_FA} ${FOFN_FILE}

Let me know if this gets you past that error.