kishwarshafin / pepper

PEPPER-Margin-DeepVariant
MIT License

Running into AttributeError #148

Closed: sagnikbanerjee15 closed this issue 2 years ago

sagnikbanerjee15 commented 2 years ago

Hello,

I am trying to execute pepper deepvariant on Nanopore data but I am encountering the following error:

[05-25-2022 02:28:55] INFO: VARIANT CALLING MODULE SELECTED
[05-25-2022 02:28:55] INFO: GVCF OPTION IS ON, FULL RE-GENOTYPING WILL BE PERFORMED.
Traceback (most recent call last):
  File "/usr/local/bin/run_pepper_margin_deepvariant", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/run_pepper_margin_deepvariant/run_pepper_margin_deepvariant.py", line 1113, in main
    run_variant_calling(options)
  File "/usr/local/lib/python3.8/dist-packages/run_pepper_margin_deepvariant/run_pepper_margin_deepvariant.py", line 530, in run_variant_calling
    post_processing_command = get_post_process_variant_call(options)
  File "/usr/local/lib/python3.8/dist-packages/run_pepper_margin_deepvariant/run_pepper_margin_deepvariant.py", line 306, in get_post_process_variant_call
    post_processing_command = post_processing_command + "rm -rf " + options.margin_output + ";\n"
AttributeError: 'Namespace' object has no attribute 'margin_output'

The command I am executing is:

run_pepper_margin_deepvariant call_variant  -o pepper_deepvariant_output --only_pepper --ont_r10_q20 -b 580421_offspring1_aligned.sortedByName_cleaned.sortedByPos.bam -f Crbn_CodOpt_PacBio_Amplicon_Sample_580421.fasta -t 32

Could you please look into it?
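For readers hitting the same traceback: the message suggests that the options object is an `argparse.Namespace` on which `margin_output` was never set for this combination of flags. The following is a minimal sketch of that failure mode and of a defensive lookup. It is not the PEPPER source code; the function name `build_cleanup_command` and the `output_dir` attribute are hypothetical and used only for illustration.

```python
import argparse


def build_cleanup_command(options):
    """Return a shell cleanup snippet, tolerating a missing attribute."""
    # getattr with a default avoids AttributeError when the attribute
    # was never set on the Namespace (assumed cause of the traceback).
    margin_output = getattr(options, "margin_output", None)
    if margin_output is None:
        return ""
    return "rm -rf " + margin_output + ";\n"


# Hypothetical options object that lacks `margin_output`.
options = argparse.Namespace(output_dir="pepper_deepvariant_output")

try:
    options.margin_output  # direct attribute access raises AttributeError
except AttributeError as error:
    print("direct access failed:", error)

# The defensive version returns an empty command instead of raising.
print(repr(build_cleanup_command(options)))
```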

I have also tried running it without the --only_pepper option. Then I get the following error:

[05-25-2022 02:37:31] INFO: VARIANT CALLING MODULE SELECTED
[05-25-2022 02:37:31] INFO: [1/9] RUNNING THE FOLLOWING COMMAND
-------
mkdir -p pepper_deepvariant_output; 
mkdir -p pepper_deepvariant_output/logs; 
mkdir -p pepper_deepvariant_output/intermediate_files; 
cp /opt/pepper_models/PEPPER_VARIANT_ONT_R10_Q20_V8.pkl pepper_deepvariant_output/intermediate_files
-------
[05-25-2022 02:37:31] INFO: [2/9] RUNNING THE FOLLOWING COMMAND
-------
time pepper_variant call_variant -b 580421_offspring1_aligned.sortedByName_cleaned.sortedByPos.bam -f Crbn_CodOpt_PacBio_Amplicon_Sample_580421.fasta -t 32 -m pepper_deepvariant_output/intermediate_files/PEPPER_VARIANT_ONT_R10_Q20_V8.pkl -o pepper_deepvariant_output/pepper/ --no_quantized  -s Sample --ont_r10_q20 2>&1 | tee pepper_deepvariant_output/logs/1_pepper.log
-------
[05-25-2022 02:37:32] INFO: ONT VARIANT CALLING MODE SELECTED.
[05-25-2022 02:37:32] INFO: MODE: PEPPER
[05-25-2022 02:37:32] INFO: THRESHOLDS ARE SET TO: 
[05-25-2022 02:37:32] INFO: MIN MAPQ:               1
[05-25-2022 02:37:32] INFO: MIN SNP BASEQ:          1
[05-25-2022 02:37:32] INFO: MIN INDEL BASEQ:            1
[05-25-2022 02:37:32] INFO: MIN SNP FREQUENCY:          0.1
[05-25-2022 02:37:32] INFO: MIN INSERT FREQUENCY:       0.1
[05-25-2022 02:37:32] INFO: MIN DELETE FREQUENCY:       0.1
[05-25-2022 02:37:32] INFO: MIN COVERAGE THRESHOLD:     3
[05-25-2022 02:37:32] INFO: MIN CANDIDATE SUPPORT:      2
[05-25-2022 02:37:32] INFO: MIN SNP CANDIDATE FREQUENCY:    0.1
[05-25-2022 02:37:32] INFO: MIN INDEL CANDIDATE FREQUENCY:  0.1
[05-25-2022 02:37:32] INFO: SKIP INDEL CANDIDATES:      False
[05-25-2022 02:37:32] INFO: MAX ALLOWED CANDIDATE IN ONE SITE:  4
[05-25-2022 02:37:32] INFO: MIN SNP PREDICTIVE VALUE:       1e-05
[05-25-2022 02:37:32] INFO: MIN INSERT PREDICTIVE VALUE:    0.001
[05-25-2022 02:37:32] INFO: MIN DELETE PREDICTIVE VALUE:    0.001
[05-25-2022 02:37:32] INFO: SNP QV CUTOFF FOR RE-GENOTYPING:    15
[05-25-2022 02:37:32] INFO: INDEL QV CUTOFF FOR RE-GENOTYPING:  30
[05-25-2022 02:37:32] INFO: REPORT ALL SNPs ABOVE THRESHOLD:    0
[05-25-2022 02:37:32] INFO: REPORT ALL INDELs ABOVE THRESHOLD:  0
[05-25-2022 02:37:32] INFO: LOW COMPLEXITY REGION SETUP:
[05-25-2022 02:37:32] INFO: MIN SNP PREDICTIVE VALUE:       1e-06
[05-25-2022 02:37:32] INFO: MIN INSERT PREDICTIVE VALUE:    0.001
[05-25-2022 02:37:32] INFO: MIN DELETE PREDICTIVE VALUE:    0.001
[05-25-2022 02:37:32] INFO: SNP QV CUTOFF FOR RE-GENOTYPING:    20
[05-25-2022 02:37:32] INFO: INDEL QV CUTOFF FOR RE-GENOTYPING:  35
[05-25-2022 02:37:32] INFO: CALL VARIANT MODULE SELECTED
[05-25-2022 02:37:32] INFO: RUN-ID: 05252022_023732
[05-25-2022 02:37:32] INFO: IMAGE OUTPUT: pepper_deepvariant_output/pepper/images_05252022_023732/
[05-25-2022 02:37:32] INFO: STEP 1/3 GENERATING IMAGES:
[05-25-2022 02:37:32] INFO: COMMON CONTIGS FOUND: ['Crbn_CodOpt_PacBio_Amplicon']
[05-25-2022 02:37:32] INFO: TOTAL CONTIGS: 1 TOTAL INTERVALS: 1 TOTAL BASES: 8198
[05-25-2022 02:37:32] INFO: STARTING PROCESS: 0 FOR 1 INTERVALS
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
[05-25-2022 02:39:08] INFO: THREAD 0 FINISHED SUCCESSFULLY.
[05-25-2022 02:39:09] INFO: FINISHED IMAGE GENERATION
[05-25-2022 02:39:09] INFO: TOTAL ELAPSED TIME FOR GENERATING IMAGES: 1 Min 37 Sec
[05-25-2022 02:39:09] INFO: STEP 2/3 RUNNING INFERENCE
[05-25-2022 02:39:09] INFO: OUTPUT: pepper_deepvariant_output/pepper/predictions_05252022_023732/
[05-25-2022 02:39:09] INFO: DISTRIBUTED CPU SETUP.
[05-25-2022 02:39:09] INFO: TOTAL CALLERS: 32
[05-25-2022 02:39:09] INFO: THREADS PER CALLER: 1
[05-25-2022 02:39:09] INFO: MODEL LOADING TO ONNX
[05-25-2022 02:39:09] INFO: SAVING MODEL TO ONNX
/usr/local/lib/python3.8/dist-packages/torch/onnx/symbolic_opset9.py:2119: UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with LSTM can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model. 
  warnings.warn("Exporting a model to ONNX with a batch_size other than 1, " +
[05-25-2022 02:39:10] INFO: SETTING THREADS TO: 1.
[05-25-2022 02:39:10] INFO: STARTING INFERENCE.
[05-25-2022 02:39:10] INFO: TOTAL SUMMARIES: 0.
[05-25-2022 02:39:10] INFO: THREAD 0 FINISHED SUCCESSFULLY.
[05-25-2022 02:39:15] INFO: FINISHED PREDICTION
[05-25-2022 02:39:15] INFO: ELAPSED TIME: 0 Min 5 Sec
[05-25-2022 02:39:15] INFO: PREDICTION FINISHED SUCCESSFULLY. 
[05-25-2022 02:39:15] INFO: TOTAL ELAPSED TIME FOR INFERENCE: 0 Min 6 Sec
[05-25-2022 02:39:15] INFO: STEP 3/3 FINDING CANDIDATES
[05-25-2022 02:39:15] INFO: OUTPUT: pepper_deepvariant_output/pepper/
[05-25-2022 02:39:16] INFO: STARTING CANDIDATE FINDING.
[05-25-2022 02:39:16] INFO: FINISHED PROCESSING, TOTAL CANDIDATES FOUND: 0
[05-25-2022 02:39:16] INFO: FINISHED PROCESSING, TOTAL VARIANTS IN PEPPER: 0
[05-25-2022 02:39:16] INFO: FINISHED PROCESSING, TOTAL VARIANTS SELECTED FOR RE-GENOTYPING: 0
[05-25-2022 02:39:16] INFO: FINISHED PROCESSING, TOTAL SNP VARIANTS SELECTED FOR RE-GENOTYPING: 0
[05-25-2022 02:39:16] INFO: FINISHED PROCESSING, TOTAL INDEL VARIANTS SELECTED FOR RE-GENOTYPING: 0
[05-25-2022 02:39:16] INFO: TOTAL TIME SPENT ON CANDIDATE FINDING: 0 Min 0 Sec
[05-25-2022 02:39:16] INFO: TOTAL ELAPSED TIME FOR FINDING CANDIDATES: 1 Min 44 Sec

real    1m45.530s
user    1m44.349s
sys 0m23.182s
[05-25-2022 02:39:16] INFO: [3/9] RUNNING THE FOLLOWING COMMAND
-------
mv pepper_deepvariant_output/pepper/PEPPER_VARIANT_FULL.vcf.gz pepper_deepvariant_output/intermediate_files/; 
mv pepper_deepvariant_output/pepper/PEPPER_VARIANT_FULL.vcf.gz.tbi pepper_deepvariant_output/intermediate_files/; 
mv pepper_deepvariant_output/pepper/PEPPER_VARIANT_OUTPUT_PEPPER.vcf.gz pepper_deepvariant_output/intermediate_files/; 
mv pepper_deepvariant_output/pepper/PEPPER_VARIANT_OUTPUT_PEPPER.vcf.gz.tbi pepper_deepvariant_output/intermediate_files/; 
mv pepper_deepvariant_output/pepper/PEPPER_VARIANT_OUTPUT_VARIANT_CALLING.vcf.gz pepper_deepvariant_output/intermediate_files/; 
mv pepper_deepvariant_output/pepper/PEPPER_VARIANT_OUTPUT_VARIANT_CALLING.vcf.gz.tbi pepper_deepvariant_output/intermediate_files/; 
mv pepper_deepvariant_output/pepper/PEPPER_VARIANT_OUTPUT_VARIANT_CALLING_SNPs.vcf.gz pepper_deepvariant_output/intermediate_files/; 
mv pepper_deepvariant_output/pepper/PEPPER_VARIANT_OUTPUT_VARIANT_CALLING_SNPs.vcf.gz.tbi pepper_deepvariant_output/intermediate_files/; 
mv pepper_deepvariant_output/pepper/PEPPER_VARIANT_OUTPUT_VARIANT_CALLING_INDEL.vcf.gz pepper_deepvariant_output/intermediate_files/; 
mv pepper_deepvariant_output/pepper/PEPPER_VARIANT_OUTPUT_VARIANT_CALLING_INDEL.vcf.gz.tbi pepper_deepvariant_output/intermediate_files/; 
rm -rf pepper_deepvariant_output/pepper/; 
echo "CONTIGS FOUND IN PEPPER VCF:"; 
zcat pepper_deepvariant_output/intermediate_files/PEPPER_VARIANT_FULL.vcf.gz | grep -v '#' | cut -f1 | uniq
-------
CONTIGS FOUND IN PEPPER VCF:
[05-25-2022 02:39:16] INFO: [4/9] RUNNING THE FOLLOWING COMMAND
-------
time margin phase 580421_offspring1_aligned.sortedByName_cleaned.sortedByPos.bam Crbn_CodOpt_PacBio_Amplicon_Sample_580421.fasta pepper_deepvariant_output/intermediate_files/PEPPER_VARIANT_FULL.vcf.gz /opt/margin_dir/params/phase/allParams.haplotag.ont-r94g507.snp.json -t 32 -V -o pepper_deepvariant_output/intermediate_files/PHASED.PEPPER_MARGIN 2>&1 | tee pepper_deepvariant_output/logs/2_margin_haplotag.log;
samtools index -@32 pepper_deepvariant_output/intermediate_files/PHASED.PEPPER_MARGIN.haplotagged.bam
-------
Running OpenMP with 32 threads.
> Parsing model parameters from file: /opt/margin_dir/params/phase/allParams.haplotag.ont-r94g507.snp.json
> Parsed 0 total VCF entries from pepper_deepvariant_output/intermediate_files/PEPPER_VARIANT_FULL.vcf.gz; kept 0 HETs, skipped 0 for region, 0 for not being PASS, 0 for being homozygous, 0 for being INDEL
No valid VCF entries found!

real    0m0.003s
user    0m0.004s
sys 0m0.000s
[E::hts_open_format] Failed to open file "pepper_deepvariant_output/intermediate_files/PHASED.PEPPER_MARGIN.haplotagged.bam" : No such file or directory
samtools index: failed to open "pepper_deepvariant_output/intermediate_files/PHASED.PEPPER_MARGIN.haplotagged.bam": No such file or directory
[05-25-2022 02:39:16] ERROR: None]
[05-25-2022 02:39:16] THE FOLLOWING COMMAND FAILED: time margin phase 580421_offspring1_aligned.sortedByName_cleaned.sortedByPos.bam Crbn_CodOpt_PacBio_Amplicon_Sample_580421.fasta pepper_deepvariant_output/intermediate_files/PEPPER_VARIANT_FULL.vcf.gz /opt/margin_dir/params/phase/allParams.haplotag.ont-r94g507.snp.json -t 32 -V -o pepper_deepvariant_output/intermediate_files/PHASED.PEPPER_MARGIN 2>&1 | tee pepper_deepvariant_output/logs/2_margin_haplotag.log;
samtools index -@32 pepper_deepvariant_output/intermediate_files/PHASED.PEPPER_MARGIN.haplotagged.bam]

kishwarshafin commented 2 years ago

@sagnikbanerjee15 ,

Looks like PEPPER didn't find any variants in the regions defined:

[05-25-2022 02:39:16] INFO: FINISHED PROCESSING, TOTAL CANDIDATES FOUND: 0
[05-25-2022 02:39:16] INFO: FINISHED PROCESSING, TOTAL VARIANTS IN PEPPER: 0
[05-25-2022 02:39:16] INFO: FINISHED PROCESSING, TOTAL VARIANTS SELECTED FOR RE-GENOTYPING: 0
[05-25-2022 02:39:16] INFO: FINISHED PROCESSING, TOTAL SNP VARIANTS SELECTED FOR RE-GENOTYPING: 0
[05-25-2022 02:39:16] INFO: FINISHED PROCESSING, TOTAL INDEL VARIANTS SELECTED FOR RE-GENOTYPING: 0

I see that you are using PacBio amplicon data. Maybe use the CCS parameter for PEPPER instead of the ONT R10 one?

sagnikbanerjee15 commented 2 years ago

Hello @kishwarshafin,

Thank you for your reply. Is there a way to have pepper_deepvariant print a message instead of failing when it does not find any variants in the region? Also, I am working with Nanopore reads, not PacBio. Should I set any specific parameters for that case?

Is there anything in the results that suggests I used PacBio?

Thank you.

kishwarshafin commented 2 years ago

@sagnikbanerjee15 , Ah, sorry, I read Crbn_CodOpt_PacBio_Amplicon_Sample_580421 and thought the BAM was PacBio. One thing you can quickly do is lower all of the thresholds shown here to increase sensitivity.

[05-25-2022 02:37:32] INFO: MIN SNP FREQUENCY:          0.1
[05-25-2022 02:37:32] INFO: MIN INSERT FREQUENCY:       0.1
[05-25-2022 02:37:32] INFO: MIN DELETE FREQUENCY:       0.1

Currently, if no variants are found anywhere in the BAM, the pipeline exits with an error.
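To tell this case apart from other failures, one option is to count the records in the intermediate PEPPER VCF after a failed run. The sketch below is a standalone helper, not part of PEPPER; the function names are hypothetical. It mirrors the `zcat ... | grep -v '#'` check that the pipeline log prints, using only the Python standard library.

```python
import gzip


def count_vcf_records(vcf_path):
    """Count data lines (lines not starting with '#') in a VCF or VCF.gz file."""
    # bgzip-compressed files are valid multi-member gzip, so gzip.open reads them.
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        return sum(
            1 for line in handle
            if line.strip() and not line.startswith("#")
        )


def has_variants(vcf_path):
    """Return True if the VCF contains at least one record."""
    return count_vcf_records(vcf_path) > 0
```

For the run above, calling `has_variants("pepper_deepvariant_output/intermediate_files/PEPPER_VARIANT_FULL.vcf.gz")` would be expected to return `False`, since the log reports 0 VCF entries parsed; a wrapper script could then print a "no variants found" message rather than surfacing the `margin phase` failure.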

You can see the full list of parameters here: https://github.com/kishwarshafin/pepper/blob/r0.8/docs/usage/usage_and_parameters.md