EichlerLab / smrtsv2

Structural variant caller
MIT License

DRMAA error #50

Closed zihhuafang closed 4 years ago

zihhuafang commented 4 years ago

Hi,

I have an issue running smrtsv with DRMAA when I pass --distribute and --drmaalib. This is how I ran smrtsv:

${SMRTSV_DIR}/smrtsv --wait-time 60 --tempdir ${TMP} --cluster-config ${SMRTSV_DIR}/cluster.json --cluster-params "bsub -J {{cluster.jobname}} -n {{cluster.cpu}} -W {{cluster.rt}} -R \"rusage[mem={{cluster.mem}}]\"" --drmaalib ${DRMAA_LIBRARY_PATH} run --batches 20 --runjobs "25,20,200,10" --threads 10 ${REF_FA} ${FOFN_FILE}

I have the following error msg:

E #d3ee [ 0.00] * fsd_exc_new(1037,Error in native specification: bsub -J ref_make_sa -n 10 -W 24:00 -R "rusage[mem=4000]",1)
Traceback (most recent call last):
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/site-packages/snakemake/__init__.py", line 544, in snakemake
    export_cwl=export_cwl)
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/site-packages/snakemake/workflow.py", line 667, in execute
    success = scheduler.schedule()
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/site-packages/snakemake/scheduler.py", line 286, in schedule
    self.run(job)
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/site-packages/snakemake/scheduler.py", line 302, in run
    error_callback=self._error)
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/site-packages/snakemake/executors.py", line 951, in run
    jobid = self.session.runJob(jt)
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/site-packages/drmaa/session.py", line 314, in runJob
    c(drmaa_run_job, jid, sizeof(jid), jobTemplate)
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/site-packages/drmaa/helpers.py", line 302, in c
    return f(*(args + (error_buffer, sizeof(error_buffer))))
  File "/cluster/work/pausch/fang/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/site-packages/drmaa/errors.py", line 151, in error_check
    raise _ERRORS[code - 1](error_string)
drmaa.errors.InvalidAttributeFormatException: code 13: Error in native specification: bsub -J ref_make_sa -n 10 -W 24:00 -R "rusage[mem=4000]"

Do you know how to fix it?

Originally posted by @d2389758 in https://github.com/EichlerLab/smrtsv2/issues/48#issuecomment-579748794

paudano commented 4 years ago

I can't see specifically why LSF is rejecting the submission, and I have never used it myself. The last line of the error message looks like everything is being parsed correctly (values from --cluster-config are substituted into the --cluster-params string). It looks like there's an invalid or improperly formatted parameter for LSF.

If you take those same parameters and run a test script with them, does it work? If you take parameters one at a time from the --cluster-params string, can you find out which one is causing the error?
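The one-at-a-time check suggested above could be scripted roughly as follows. This is only a sketch, not something from smrtsv2 or Snakemake: `option_groups` and `try_spec` are hypothetical helper names, the grouping rule (a token starting with `-` begins a new option) is a simplifying assumption, and `try_spec` really submits a trivial job, so it should only be run on a cluster where that is acceptable.

```python
import shlex


def option_groups(native_spec):
    """Split a native specification string into per-option token groups.

    Assumption: every option starts with '-' and its values do not.
    Tokens before the first option (such as a leading submit command)
    end up in a group of their own.
    """
    groups = []
    for token in shlex.split(native_spec):
        if token.startswith("-") or not groups:
            groups.append([token])
        else:
            groups[-1].append(token)
    return groups


def try_spec(session, spec, command="/bin/true"):
    """Submit a trivial job with `spec` as the DRMAA native specification.

    Returns (job_id, None) on success or (None, exception) on failure.
    `command` is an assumed harmless executable on the compute nodes.
    """
    import drmaa  # imported lazily so option_groups works without a DRMAA library

    jt = session.createJobTemplate()
    try:
        jt.remoteCommand = command
        jt.nativeSpecification = spec
        return session.runJob(jt), None
    except drmaa.errors.DrmaaException as exc:
        return None, exc
    finally:
        session.deleteJobTemplate(jt)
```

With the string from the error message, `option_groups` yields five groups. Each can be joined with spaces and passed to `try_spec` inside a `with drmaa.Session() as s:` block to see which one is rejected (values containing spaces would need re-quoting first). The first group is the literal `bsub`, which is a submit command rather than an option. Whether lsf-drmaa accepts it inside a native specification is not established in this thread, so it is worth testing on its own.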

zihhuafang commented 4 years ago

I ran the same parameters with a test script and it worked. I think the problem is the connection between python drmaa and lsf-drmaa. It seems that the strings cannot be formatted correctly.

I ran a test script for python drmaa (from the manual of python-drmaa):

#!/usr/bin/env python

import drmaa

def main():
    """ Query the system. """
    with drmaa.Session() as s:
        print('A DRMAA object was created')
        print('Supported contact strings: %s' % s.contact)
        print('Supported DRM systems: %s' % s.drmsInfo)
        print('Supported DRMAA implementations: %s' % s.drmaaImplementation)
        print('Version %s' % s.version)

        print('Exiting')

if __name__=='__main__':
    main()

And I got the following error msg:

A DRMAA object was created
Supported contact strings:
Supported DRM systems: IBM Spectrum LSF 10.1
Supported DRMAA implementations: FedStage DRMAA for LSF 1.1.1
Traceback (most recent call last):
  File "./test.py", line 17, in <module>
    main()
  File "./test.py", line 12, in main
    print('Version %s' % s.version)
TypeError: not all arguments converted during string formatting
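A note on this particular TypeError: Python raises it when the right-hand operand of `%` is a tuple with more elements than the format string has placeholders. The output shows that the session was created and the DRM was queried before the failure, so the error may come from the formatting line in the test script rather than from the LSF connection. The sketch below reproduces the message without any DRMAA library, assuming `s.version` is a tuple-like (major, minor) value. The thread does not confirm this, and the `Version` type here is a hypothetical stand-in.

```python
from collections import namedtuple

# Hypothetical stand-in for whatever s.version returns; assumed to be
# a tuple subclass with two fields.
Version = namedtuple("Version", ["major", "minor"])


def naive_format(version):
    # A tuple on the right of % is unpacked as the argument list, so a
    # two-element tuple does not fit a single %s placeholder.
    return "Version %s" % version


def safe_format(version):
    # Wrapping the value in a one-element tuple makes % treat it as a
    # single argument.
    return "Version %s" % (version,)
```

Under that assumption, changing the failing line of the test script to `print('Version %s' % (s.version,))` would avoid the exception.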

It seems that without --drmaalib (--distribute), I cannot submit jobs to the cluster. Snakemake therefore runs on the login node, which does not have enough memory for all the jobs. Is there any way to get around this? I have already posted an issue on the drmaa-python GitHub repository.

zihhuafang commented 4 years ago

I ended up running align.snakefile on its own, which allowed all the jobs to be distributed across the cluster without DRMAA.

paudano commented 4 years ago

Right now, it's set up to use DRMAA. I probably could have made this more flexible, but it isn't at the moment. Sorry for the difficulty getting it running! Supporting multiple cluster environments can be tricky.