cbg-ethz / V-pipe

V-pipe is a pipeline designed for analysing NGS data of short viral genomes
https://cbg-ethz.github.io/V-pipe/
Apache License 2.0

predicthaplo does not work on sars-cov-2 tutorial #135

Closed Masterxilo closed 1 year ago

Masterxilo commented 1 year ago

Hi there. Apparently this data does not have enough divergence, so a crash with haploclique is expected.

predicthaplo, however, should apparently work, but it does not:

Removing output files of failed job predicthaplo since they might be corrupted:
samples/SRR10903401/20200102/variants/global/REF_aln.sam
Configuration:
  prefix = samples/SRR10903402/20200102/variants/global/predicthaplo/
  cons = /home/ubuntu/new-vpipe-haplotype-recon-experiments/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta
  visualization_level = 1
  FASTAreads = samples/SRR10903402/20200102/variants/global/REF_aln.sam
  have_true_haplotypes = 0
  FASTAhaplos = 
  do_local_Analysis = 1
After parsing the reads in file samples/SRR10903402/20200102/variants/global/REF_aln.sam: average read length= -nan 0
First read considered in the analysis starts at position 100000. Last read ends at position 0
There are 0 reads
/usr/bin/bash: line 3: 25922 Segmentation fault      predicthaplo --sam samples/SRR10903402/20200102/variants/global/REF_aln.sam --reference /home/ubuntu/new-vpipe-haplotype-recon-experiments/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta --prefix samples/SRR10903402/20200102/variants/global/predicthaplo/ --have_true_haplotypes 0 --min_length 0 2> >(tee -a samples/SRR10903402/20200102/variants/global/predicthaplo.err.log >&2)
[Fri Nov 18 11:34:03 2022]
Error in rule predicthaplo:
    jobid: 22
    input: samples/SRR10903402/20200102/alignments/REF_aln.bam, /home/ubuntu/new-vpipe-haplotype-recon-experiments/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta
    output: samples/SRR10903402/20200102/variants/global/REF_aln.sam, samples/SRR10903402/20200102/variants/global/predicthaplo_haplotypes.fasta
    log: samples/SRR10903402/20200102/variants/global/predicthaplo.out.log, samples/SRR10903402/20200102/variants/global/predicthaplo.err.log (check log file(s) for error message)
    conda-env: /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-sars-cov-2-example/.snakemake/conda/648dc97f886b8633756d6cd60de0ff7c_
    shell:

            samtools sort -n samples/SRR10903402/20200102/alignments/REF_aln.bam -o samples/SRR10903402/20200102/variants/global/REF_aln.sam 2> >(tee samples/SRR10903402/20200102/variants/global/predicthaplo.err.log >&2)

            predicthaplo                 --sam samples/SRR10903402/20200102/variants/global/REF_aln.sam                 --reference /home/ubuntu/new-vpipe-haplotype-recon-experiments/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta                 --prefix samples/SRR10903402/20200102/variants/global/predicthaplo/                 --have_true_haplotypes 0                 --min_length 0                 2> >(tee -a samples/SRR10903402/20200102/variants/global/predicthaplo.err.log >&2)

            # TODO: copy over actual haplotypes
            touch samples/SRR10903402/20200102/variants/global/predicthaplo_haplotypes.fasta

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job predicthaplo since they might be corrupted:
samples/SRR10903402/20200102/variants/global/REF_aln.sam
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-11-18T113259.947175.snakemake.log

I just followed the tutorial https://github.com/cbg-ethz/V-pipe/blob/master/docs/tutorial_sarscov2.md and set the following in config.yaml:

general:
    virus_base_config: 'sars-cov-2'
    # e.g: 'hiv', 'sars-cov-2', or absent

    # the tool selected as haplotype_reconstruction does the global haplotype reconstruction
    haplotype_reconstruction: predicthaplo

output:
    # enable global haplotype reconstruction
    # might not work with this data...
    #
    # > nope, this data does not support haplotype reconstruction, not enough divergence?...
    global: true

The rest are the default options.
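One way to narrow down the `There are 0 reads` message in the log is to check whether the intermediate SAM file contains any alignment records before predicthaplo is run on it. The following is a minimal Python sketch, not part of V-pipe: `count_sam_records` is a hypothetical helper, and it relies only on the SAM convention that header lines start with `@`.

```python
from pathlib import Path


def count_sam_records(sam_path):
    """Count alignment records in a plain-text SAM file.

    Header lines start with '@'; every other non-empty line is
    treated as one alignment record.
    """
    count = 0
    with Path(sam_path).open() as handle:
        for line in handle:
            if line.strip() and not line.startswith("@"):
                count += 1
    return count
```

Because Snakemake removes the outputs of a failed job, the SAM file would have to be regenerated by hand (for example with the `samtools sort -n` command shown in the log) before it can be checked this way.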

DrYak commented 1 year ago

The cause was a problem creating the output directory. Fixed in #142.
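For anyone hitting the same symptom on an older checkout, one possible guard is to create the directory named by `--prefix` before the tool runs. The sketch below is an assumption about what such a guard could look like, not the change made in #142. `ensure_prefix_dir` is a hypothetical name, and the trailing-slash rule is inferred from the `prefix = .../predicthaplo/` line in the log.

```python
from pathlib import Path


def ensure_prefix_dir(prefix):
    """Create the directory implied by an output prefix and return it.

    Assumption: a prefix ending in '/' names a directory itself;
    otherwise the files are written into the prefix's parent directory.
    """
    path = Path(prefix)
    directory = path if prefix.endswith("/") else path.parent
    # parents=True creates missing intermediate directories;
    # exist_ok=True makes repeated calls harmless.
    directory.mkdir(parents=True, exist_ok=True)
    return directory
```

It could be called as `ensure_prefix_dir("samples/SRR10903402/20200102/variants/global/predicthaplo/")` before invoking the tool.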