KolmogorovLab / hapdup

Pipeline to convert a haploid assembly into diploid

Exception: Missing output #34

Open Andreas-Bio opened 1 year ago

Andreas-Bio commented 1 year ago

Any ideas? The PEPPER output seems to be empty: the VCF contains nothing below the header line

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample

I think the settings might not be sensitive enough. As a test, I used a single gene at ~35x coverage to see whether the pipeline works. The haplotypes diverge by 0.6%, entirely from indels, with no substitutions.

docker run -v /home/bio/Documents/C2:/home/bio/Documents/C2 -u `id -u`:`id -g` mkolmogo/hapdup:0.12 \
  hapdup --assembly /home/bio/Documents/C2/assembly.fasta --bam /home/bio/Documents/C2/lr_mapping.bam --out-dir /home/bio/Documents/C2/hapdup -t 24 --rtype ont
[2023-07-17 01:37:10] INFO: Filtering alignments
[2023-07-17 01:37:10] INFO: Running: flye-samtools index -@4 /home/bio/Documents/C2/hapdup/filtered.bam
[2023-07-17 01:37:11] INFO: Running: pepper_variant call_variant -b /home/bio/Documents/C2/hapdup/filtered.bam -f /home/bio/Documents/C2/assembly.fasta -o /home/bio/Documents/C2/hapdup/pepper -m /home/bio/Documents/C2/hapdup/pepper/pepper_model.bin -t 24 -s Sample --ont_r9_guppy5_sup --include-supplementary --no_quantized 2>&1 |tee /home/bio/Documents/C2/hapdup/pepper/pepper.log
[07-17-2023 01:37:11] INFO: ONT VARIANT CALLING MODE SELECTED.
[07-17-2023 01:37:11] INFO: MODE: PEPPER SNP
[07-17-2023 01:37:11] INFO: THRESHOLDS ARE SET TO: 
[07-17-2023 01:37:11] INFO: MIN MAPQ:               5
[07-17-2023 01:37:11] INFO: MIN SNP BASEQ:          1
[07-17-2023 01:37:11] INFO: MIN INDEL BASEQ:            1
[07-17-2023 01:37:11] INFO: MIN SNP FREQUENCY:          0.1
[07-17-2023 01:37:11] INFO: MIN INSERT FREQUENCY:       0.15
[07-17-2023 01:37:11] INFO: MIN DELETE FREQUENCY:       0.15
[07-17-2023 01:37:11] INFO: MIN COVERAGE THRESHOLD:     3
[07-17-2023 01:37:11] INFO: MIN CANDIDATE SUPPORT:      2
[07-17-2023 01:37:11] INFO: MIN SNP CANDIDATE FREQUENCY:    0.1
[07-17-2023 01:37:11] INFO: MIN INDEL CANDIDATE FREQUENCY:  0.1
[07-17-2023 01:37:11] INFO: SKIP INDEL CANDIDATES:      False
[07-17-2023 01:37:11] INFO: MAX ALLOWED CANDIDATE IN ONE SITE:  4
[07-17-2023 01:37:11] INFO: MIN SNP PREDICTIVE VALUE:       0.1
[07-17-2023 01:37:11] INFO: MIN INSERT PREDICTIVE VALUE:    0.25
[07-17-2023 01:37:11] INFO: MIN DELETE PREDICTIVE VALUE:    0.25
[07-17-2023 01:37:11] INFO: SNP QV CUTOFF FOR RE-GENOTYPING:    15
[07-17-2023 01:37:11] INFO: INDEL QV CUTOFF FOR RE-GENOTYPING:  10
[07-17-2023 01:37:11] INFO: REPORT ALL SNPs ABOVE THRESHOLD:    0
[07-17-2023 01:37:11] INFO: REPORT ALL INDELs ABOVE THRESHOLD:  0
[07-17-2023 01:37:11] INFO: CALL VARIANT MODULE SELECTED
[07-17-2023 01:37:11] INFO: RUN-ID: 07172023_013711
[07-17-2023 01:37:11] INFO: IMAGE OUTPUT: /home/bio/Documents/C2/hapdup/pepper/images_07172023_013711/
[07-17-2023 01:37:11] INFO: STEP 1/3 GENERATING IMAGES:
[07-17-2023 01:37:11] INFO: COMMON CONTIGS FOUND: ['contig_1']
[07-17-2023 01:37:11] INFO: TOTAL CONTIGS: 1 TOTAL INTERVALS: 1 TOTAL BASES: 7641
[07-17-2023 01:37:11] INFO: STARTING PROCESS: 0 FOR 1 INTERVALS
[07-17-2023 01:37:11] INFO: THREAD 0 FINISHED SUCCESSFULLY.
[07-17-2023 01:37:11] INFO: FINISHED IMAGE GENERATION
[07-17-2023 01:37:11] INFO: TOTAL ELAPSED TIME FOR GENERATING IMAGES: 0 Min 0 Sec
[07-17-2023 01:37:11] INFO: STEP 2/3 RUNNING INFERENCE
[07-17-2023 01:37:11] INFO: OUTPUT: /home/bio/Documents/C2/hapdup/pepper/predictions_07172023_013711/
[07-17-2023 01:37:11] INFO: DISTRIBUTED CPU SETUP.
[07-17-2023 01:37:11] INFO: TOTAL CALLERS: 24
[07-17-2023 01:37:11] INFO: THREADS PER CALLER: 1
[07-17-2023 01:37:11] INFO: MODEL LOADING TO ONNX
[07-17-2023 01:37:11] INFO: SAVING MODEL TO ONNX
/usr/local/lib/python3.8/dist-packages/torch/onnx/symbolic_opset9.py:2095: UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with LSTM can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model. 
  warnings.warn("Exporting a model to ONNX with a batch_size other than 1, " +
[07-17-2023 01:37:12] INFO: SETTING THREADS TO: 1.
[07-17-2023 01:37:12] INFO: STARTING INFERENCE.
[07-17-2023 01:37:12] INFO: TOTAL SUMMARIES: 0.
[07-17-2023 01:37:12] INFO: THREAD 0 FINISHED SUCCESSFULLY.
[07-17-2023 01:37:12] INFO: FINISHED PREDICTION
[07-17-2023 01:37:12] INFO: ELAPSED TIME: 0 Min 0 Sec
[07-17-2023 01:37:12] INFO: PREDICTION FINISHED SUCCESSFULLY. 
[07-17-2023 01:37:12] INFO: TOTAL ELAPSED TIME FOR INFERENCE: 0 Min 0 Sec
[07-17-2023 01:37:12] INFO: STEP 3/3 FINDING CANDIDATES
[07-17-2023 01:37:12] INFO: OUTPUT: /home/bio/Documents/C2/hapdup/pepper/
[07-17-2023 01:37:12] INFO: STARTING CANDIDATE FINDING.
[07-17-2023 01:37:12] INFO: FINISHED PROCESSING, TOTAL CANDIDATES FOUND: 0
[07-17-2023 01:37:12] INFO: FINISHED PROCESSING, TOTAL VARIANTS IN PEPPER: 0
[07-17-2023 01:37:12] INFO: FINISHED PROCESSING, TOTAL VARIANTS SELECTED FOR RE-GENOTYPING: 0
[07-17-2023 01:37:12] INFO: TOTAL TIME SPENT ON CANDIDATE FINDING: 0 Min 0 Sec
[07-17-2023 01:37:12] INFO: TOTAL ELAPSED TIME FOR FINDING CANDIDATES: 0 Min 1 Sec
[2023-07-17 01:37:12] INFO: Running: margin phase /home/bio/Documents/C2/hapdup/filtered.bam /home/bio/Documents/C2/assembly.fasta /home/bio/Documents/C2/hapdup/pepper/PEPPER_VARIANT_FULL.vcf /opt/margin_params/phase/allParams.haplotag.ont-r94g507.hapDup.json -t 24 -o /home/bio/Documents/C2/hapdup/margin/MARGIN_PHASED 2>&1 |tee /home/bio/Documents/C2/hapdup/margin/margin.log
Running OpenMP with 24 threads.
> Parsing model parameters from file: /opt/margin_params/phase/allParams.haplotag.ont-r94g507.hapDup.json
> Parsed 0 total VCF entries from /home/bio/Documents/C2/hapdup/pepper/PEPPER_VARIANT_FULL.vcf; kept 0 HETs, skipped 0 for region, 0 for not being PASS, 0 for being homozygous, 0 for being INDEL
No valid VCF entries found!
[2023-07-17 01:37:12] ERROR: Missing output: /home/bio/Documents/C2/hapdup/margin/MARGIN_PHASED.haplotagged.bam
Traceback (most recent call last):
  File "/usr/local/bin/hapdup", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/hapdup/main.py", line 206, in main
    file_check(haplotagged_bam)
  File "/usr/local/lib/python3.8/dist-packages/hapdup/main.py", line 114, in file_check
    raise Exception("Missing output")
Exception: Missing output

Callithrix-omics commented 1 year ago

I haven't found a solution, but I am having a similar problem. Mine stems from the filtered BAMs made by hapdup being empty. Do those BAMs contain anything on your end?
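
One way to look into this from the posted log alone, without opening the BAM, is to list the counters that PEPPER reports as exactly zero. The following is a minimal Python sketch, assuming the log lines are formatted as in the output above; `zero_count_lines` is a made-up helper name and not part of hapdup or PEPPER. Reading a zero `TOTAL SUMMARIES` as "no reads reached the caller" is an assumption, not something the log states.

```python
import re

# Matches lines such as "... INFO: TOTAL SUMMARIES: 0." or
# "... INFO: FINISHED PROCESSING, TOTAL CANDIDATES FOUND: 0".
# The pattern is derived only from the log pasted above (an assumption);
# other PEPPER versions may format these lines differently.
ZERO_COUNT = re.compile(r"INFO: (?P<label>[A-Z ,]+): 0\.?\s*$")


def zero_count_lines(log_text):
    """Return the labels of all counters that the log reports as zero."""
    labels = []
    for line in log_text.splitlines():
        match = ZERO_COUNT.search(line)
        if match:
            labels.append(match.group("label").strip())
    return labels
```

For example, `zero_count_lines(open("pepper.log").read())` could be run on the `pepper.log` file that the pipeline writes. Elapsed-time lines and the threshold settings are skipped because they do not end in a bare `0`, or are not aligned with a single space after the colon.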

mikolmogorov commented 1 year ago

Hi both,

Can you give more information about what you are trying to assemble? Hapdup is currently designed for whole-genome long-read assemblies, and its heuristics may not work for "local" phasing of short sequences. Have you tried decreasing the --min-aligned-length parameter (it's 10k by default)?
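
For context, the log above reports `TOTAL BASES: 7641` for the single contig, which is shorter than the 10k default mentioned here, so alignments to it may be unable to pass the length filter. A minimal Python sketch of that comparison follows. Both function names are hypothetical helpers, not part of hapdup; the 10,000 default is taken from this comment, and how hapdup actually measures aligned length is not confirmed here.

```python
def contig_lengths(fasta_text):
    """Map each FASTA record name to the length of its sequence."""
    lengths = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def contigs_below_threshold(fasta_text, min_aligned_length=10_000):
    """Return sorted names of contigs shorter than min_aligned_length.

    The 10_000 default mirrors the value quoted in the thread (an assumption
    about the installed hapdup version).
    """
    lengths = contig_lengths(fasta_text)
    return sorted(n for n, size in lengths.items() if size < min_aligned_length)
```

Calling `contigs_below_threshold(open("assembly.fasta").read())` lists the contigs for which a lower --min-aligned-length value is probably needed.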