bhattlab / MGEfinder

A toolbox for identifying mobile genetic element (MGE) insertions from short-read sequencing data of bacterial isolates.
MIT License

job pair error #2

Closed. yingeddi2008 closed this issue 4 years ago

yingeddi2008 commented 4 years ago

Hi Matthew,

It's me again! Hope everything is going well with you. We finally have some meaningful data to run through your software, but the workflow fails with an error in the pair job.

I set up my folders as the following:

├── 00.assembly
│   ├── ST1_19.fna
│   ├── ST1_20.fna
│   └── ST1_6.fna
├── 00.bam
│   ├── ST1_19.ST1_12.bam
│   ├── ST1_19.ST1_12.bam.bai
│   ├── ST1_20.ST1_12.bam
│   ├── ST1_20.ST1_12.bam.bai
│   ├── ST1_6.ST1_12.bam
│   └── ST1_6.ST1_12.bam.bai
└── 00.genome
    └── ST1_12.fna
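
A layout like this can be sanity-checked before launching the workflow. The sketch below assumes the naming pattern visible in the listing above (`<sample>.<genome>.bam` with a `.bai` index next to it, `<sample>.fna` assemblies, and `<genome>.fna` references). The function name `check_workdir` is hypothetical and not part of MGEfinder.

```python
from pathlib import Path


def check_workdir(workdir):
    """Return a list of problems found in a workdir laid out as in the listing."""
    root = Path(workdir)
    problems = []
    for bam in sorted((root / "00.bam").glob("*.bam")):
        # Expect BAM names of the form <sample>.<genome>.bam
        parts = bam.name[: -len(".bam")].split(".")
        if len(parts) != 2:
            problems.append(f"unexpected BAM name: {bam.name}")
            continue
        sample, genome = parts
        # Each BAM should have its index next to it
        if not Path(str(bam) + ".bai").exists():
            problems.append(f"missing index for {bam.name}")
        # The sample and genome named in the BAM should exist as FASTA files
        if not (root / "00.assembly" / f"{sample}.fna").exists():
            problems.append(f"missing assembly for sample {sample}")
        if not (root / "00.genome" / f"{genome}.fna").exists():
            problems.append(f"missing reference genome {genome}")
    return problems
```

Calling `check_workdir("workdir")` on the tree shown here should return an empty list if the pattern assumed above holds.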

The error message is below. I hope it is not caused by how I named the isolates.

Error in job pair while creating output file workdir/01.mgefinder/ST1_12/ST1_6/02.pair.ST1_6.ST1_12.tsv.
RuleException:
CalledProcessError in line 106 of /home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/workflow/Snakefile:
Command '
        mgefinder pair -maxdr 20 -minq 20 -minial 21 -maxjsp 0.15         -lins 30 workdir/01.mgefinder/ST1_12/ST1_6/01.find.ST1_6.ST1_12.tsv workdir/00.bam/ST1_6.ST1_12.bam workdir/00.genome/ST1_12.fna -o workdir/01.mgefinder/ST1_12/ST1_6/02.pair.ST1_6.ST1_12.tsv &> workdir/01.mgefinder/ST1_12/ST1_6/log/ST1_6.ST1_12.pair.log
        ' returned non-zero exit status 1.
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/workflow/Snakefile", line 106, in __rule_pair
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Will exit after finishing currently running jobs.
Finished job 28.
6 of 31 steps (19%) done
Will exit after finishing currently running jobs.
Finished job 29.
7 of 31 steps (23%) done
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
Traceback (most recent call last):
  File "/home/dfi_user/miniconda3/envs/mgefinder/bin/mgefinder", line 8, in <module>
    sys.exit(cli())
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/main.py", line 47, in workflow
    _workflow(workdir, snakefile, configfile, cores, memory, unlock, rerun_incomplete, keep_going)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/workflow.py", line 19, in _workflow
    shell(cmd)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/snakemake/shell.py", line 88, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'snakemake -s /home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/workflow/Snakefile --config wd=workdir/ memory=16000 --cores 4 --configfile /home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/workflow/config.yml' returned non-zero exit status 1.

I hope those are helpful! Thanks in advance. Let me know if you want a copy of the files to reproduce the error message.

Eddi

durrantmm commented 4 years ago

Hi Eddi, sorry to hear you encountered another error.

Could you show me the contents of the file workdir/01.mgefinder/ST1_12/ST1_6/log/ST1_6.ST1_12.pair.log mentioned in the error report?

Thank you!

yingeddi2008 commented 4 years ago

I wonder if I swapped the reference and the isolate. ST1_12 is the isolate that does not have the inserted elements, while ST1_6 does. Also, for a little background: the reference genome ST1_12 comes from a hybrid assembly of nanopore and Illumina reads. The similarity between ST1_12 and ST1_6 is above 99%. Would that be a problem?

Here is the content of the log:

#### PARAMETERS ###
command: pair
findfile: workdir/01.mgefinder/ST1_12/ST1_6/01.find.ST1_6.ST1_12.tsv
bamfile: workdir/00.bam/ST1_6.ST1_12.bam
genome: workdir/00.genome/ST1_12.fna
max_direct_repeat_length: 20
min_alignment_quality: 20
min_alignment_inner_length: 21
max_junction_spanning_prop: 0.15
large_insertion_cutoff: 30
output_file: workdir/01.mgefinder/ST1_12/ST1_6/02.pair.ST1_6.ST1_12.tsv
###################
Finding all flank pairs within 20 bases of each other ...
Finding all inverted repeats at termini in 9 candidate pairs...
Assigning pairs according to existence of inverted repeats, read count difference, and flank length difference...
Filtering out pairs with evidence of reads spanning both clipped junctions...
Traceback (most recent call last):
  File "/home/dfi_user/miniconda3/envs/mgefinder/bin/mgefinder", line 8, in <module>
    sys.exit(cli())
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/main.py", line 112, in pair
    min_alignment_inner_length, max_junction_spanning_prop, large_insertion_cutoff, output_file)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/pair.py", line 40, in _pair
    flank_pairs = flank_pairer.run_pair_flanks()
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/pair.py", line 101, in run_pair_flanks
    analyzed_pairs = self.count_insertion_spanning_reads(assigned_pairs)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/pair.py", line 305, in count_insertion_spanning_reads
    reads = self.get_reads_at_site(contig, pos_3p, pos_5p, self.bam, contig_lengths)
  File "/home/dfi_user/miniconda3/envs/mgefinder/lib/python3.6/site-packages/mgefinder/pair.py", line 330, in get_reads_at_site
    if end > contig_lengths[contig]:
KeyError: 1

durrantmm commented 4 years ago

Ok, thanks for that. I haven't seen this error before. Can you please share your working directory with me so I can reproduce the problem? You can send it to my email.

durrantmm commented 4 years ago

Ok, I fixed the issue. It was caused by the fact that each contig in your reference genome was named with only an integer (1 and 2). I fixed this bug in the code and it should now run smoothly for you.
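
The `KeyError: 1` in the log fits this explanation: Python prints a missing string key as `'1'`, so an unquoted `1` means the lookup used an integer. The sketch below reproduces that failure mode with a plain dictionary and shows one way to spot integer-only contig names in a FASTA file. It is illustrative only: `contig_names` and `integer_only` are hypothetical helpers, the FASTA lines are toy data, and the actual patch in mgefinder may look different.

```python
def contig_names(fasta_lines):
    """Yield the contig name (first word of each '>' header) from FASTA lines."""
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def integer_only(names):
    """Return the names that consist solely of digits, e.g. '1' or '2'."""
    return [name for name in names if name.isdigit()]


# Toy FASTA with contigs named only "1" and "2" (made-up sequences).
fasta = [">1", "ACGT", ">2", "GGCC"]
names = list(contig_names(fasta))

# Lengths keyed by contig name as a string, the form a FASTA header provides.
contig_lengths = {"1": 4, "2": 4}

# If a table column holding "1" is parsed as a number, the lookup key is int 1.
try:
    contig_lengths[1]
except KeyError as err:
    message = f"KeyError: {err}"

# Casting the key back to str restores the match.
length = contig_lengths[str(1)]
```

Here `integer_only(names)` flags both contigs, and `message` holds the same text as the last line of the pair log.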

You will just need to reinstall the mgefinder python package.

First, make sure that you are in the mgefinder conda environment.

Next, run the following commands:

pip uninstall mgefinder
pip install --no-cache mgefinder

This should install mgefinder v1.0.1 with the bug patch. Let me know if that works. Thank you for bringing this to my attention!
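
To confirm that the reinstall picked up the patched release, the installed version can be compared against 1.0.1. This is a minimal sketch using only the standard library (with a `pkg_resources` fallback for older interpreters such as the Python 3.6 environment in the traceback). The helper names `version_tuple`, `has_contig_fix`, and `installed_version` are hypothetical and not part of mgefinder.

```python
def version_tuple(version):
    """Turn a version string such as '1.0.1' into a comparable tuple (1, 0, 1).

    Non-numeric parts (for example 'rc1') are ignored.
    """
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def has_contig_fix(version, fixed_in="1.0.1"):
    """Return True if `version` is at least the release said to contain the fix."""
    return version_tuple(version) >= version_tuple(fixed_in)


def installed_version(package="mgefinder"):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        import pkg_resources  # fallback for older Python versions

        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("mgefinder is not installed in this environment")
elif has_contig_fix(found):
    print(found, "includes the fix")
else:
    print(found, "predates the fix")
```

Run it with the Python interpreter of the mgefinder conda environment so that it inspects the right installation.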