CSB5 / lofreq

LoFreq Star: Sensitive variant calling from sequencing data
http://csb5.github.io/lofreq/
Other
97 stars 30 forks source link

Warning: indel calls (before filtering) were made without indel alignment-quality #104

Open rahil19 opened 3 years ago

rahil19 commented 3 years ago

Hi,

I ran the lofreq variant calling with --call-indels option, and I see the above warning thrown during the runtime. This is despite running realignment and InDel qual inclusion prior to running call module of lofreq. The exact wordings of the warning

indel calls (before filtering) were made without indel alignment-quality! Did you forget to indel alignment-quality to your bam-file?

The output does generate INDELs but not sure if I can trust the result after seeing this warning.

Following commands were run:

lofreq viterbi --verbose -f ref.fasta -o sample_lofreq_realign.bam  sample_bwa_mark_dup.bam

lofreq indelqual --verbose --dindel -f ref.fasta -o sample_lofreq_realign_indelq.bam sample_lofreq_realign.bam

samtools sort -o sample_lofreq_realign_indelq_sorted.bam sample_lofreq_realign_indelq.bam

samtools index sample_lofreq_realign_indelq_sorted.bam

lofreq call --call-indels --min-cov 50 -q 30 -Q 30 -m 20 --verbose -f ref.fasta -o sample_lofreq_realign_indelq_call.vcf sample_lofreq_realign_indelq_sorted.bam

The reason for running lofreq indelqual instead of GATK’s BQSR because the sample is virus instead of human.

Mark duplicates was run through Picard using the following code

 java -jar /opt/packages/picard/2.20.2/picard.jar MarkDuplicates \
       I=sample_bwa_sorted.bam \
       O=sample_bwa_mark_dup.bam \
       M=sample_mark_dup_metrics.txt

 samtools index sample_bwa_mark_dup.bam

Please let me know how can I fix this issue.

Thanks, Rahil

Zymergen-phaverty commented 3 years ago

I have been seeing the same warning with BAMs that have been through lofreq indelqual.

andreas-wilm commented 3 years ago

Hi Peter and Rahil,

apologies for the late reply. The alignment qualities are actually inserted when running lofreq alnqual. This will add BAQ (base alignment quality) and the equivalent for indels. The indelqual subcommand only inserts indel qualities (not indel alignment qualities) just like GATK BQSR did in the past (but they stopped doing so with latest versions).

Hope this helps, Andreas

On Wed, 18 Nov 2020 at 00:16, Peter Haverty notifications@github.com wrote:

I have been seeing the same warning with BAMs that have been through lofreq indelqual.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/CSB5/lofreq/issues/104#issuecomment-729036388, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAILSCOHQED4EHWPOZT7PPTSQKOXVANCNFSM4TSG5BTQ .

-- Andreas Wilm andreas.wilm@gmail.com | 0x7C68FBCC

Zymergen-phaverty commented 3 years ago

Oh! I was not using lofreq alqual. The instructions say that alnqual isn't necessary as the base and indel qualities are computed on the fly by lofreq call. Should I be doing lofreq indelqual -> lofreq alqual -> lofreq call instead?

rahil19 commented 3 years ago

Hi Andreas,

I included the lofreq alnqual step before running lofreq call. But It still threw the same warning

WARNING(lofreq_call.c|main_call): 1 indel calls (before filtering) were made without indel alignment-quality! Did you forget to indel alignment-quality to your bam-file?

Please see the updated commands to know how I included alnqual step

lofreq viterbi --verbose -f ref.fasta  sample_bwa_mark_dup.bam | samtools sort - > sample_lofreq_realign.bam 

samtools index sample_lofreq_realign.bam 

lofreq indelqual --verbose --dindel -f ref.fasta sample_lofreq_realign.bam | samtools sort - > sample_lofreq_realign_indelq.bam

samtools index sample_lofreq_realign_indelq.bam

lofreq alnqual -b sample_lofreq_realign_indelq.bam ref.fasta | samtools sort - > sample_lofreq_realign_indelq_alnq.bam

samtools index sample_lofreq_realign_indelq_alnq.bam

lofreq call --call-indels --min-cov 50 -q 30 -Q 30 -m 20 --verbose -f ref.fasta -o sample_lofreq_realign_indelq_alnq_call.vcf sample_lofreq_realign_indelq_alnq.bam

Please let me know if I'm running alnqual incorrectly. If not, why lofreq call throws such warning now?

Thanks, Rahil

andreas-wilm commented 3 years ago

Oh! I was not using lofreq alqual. The instructions say that alnqual isn't necessary as the base and indel qualities are computed on the fly by lofreq call. Should I be doing lofreq indelqual -> lofreq alqual -> lofreq call instead?

Oh yes please. If you want to predict indels then this is indeed the preferred way. Base alignment qualities are computed on the fly, but not indel alignment qualities.

andreas-wilm commented 3 years ago

Hi Rahil,

I included the lofreq alnqual step before running lofreq call. But It still threw the same warning

The required sub-command here is indelqual! :) Having said this, you did include it in the list below:


lofreq viterbi --verbose -f ref.fasta  sample_bwa_mark_dup.bam | samtools sort - > sample_lofreq_realign.bam 

samtools index sample_lofreq_realign.bam 

lofreq indelqual --verbose --dindel -f ref.fasta sample_lofreq_realign.bam | samtools sort - > sample_lofreq_realign_indelq.bam

No need to sort here. This command just adds qualities but doesn't change order of reads.

lofreq alnqual -b sample_lofreq_realign_indelq.bam ref.fasta | samtools sort - > sample_lofreq_realign_indelq_alnq.bam

No need to sort here. This command just adds qualities but doesn't change order of reads.

lofreq call --call-indels --min-cov 50 -q 30 -Q 30 -m 20 --verbose -f ref.fasta -o sample_lofreq_realign_indelq_alnq_call.vcf sample_lofreq_realign_indelq_alnq.bam

I would advise against too many filters (-q30 -Q30 -m20) as this distorts the signal. Error and noise are built into LoFreq's calling method!

After running this series of commands you really shouldn't see the warning anymore. All aligned reads with indels should have "ai"/"ad" tags assigned to them by indelqual

GoncheDanesh commented 9 months ago

Hi @rahil19 , Were you able to solve your issue? I am having the same warnings here even though I ran the commands cited earlier :

OUTPUT_MDUP="${basename}_markdup.bam"
    java -Xmx4g -jar /usr/local/picard-tools-2.19.0/picard.jar MarkDuplicates \
                         I=${bam_file} \
                 O=${OUTPUT_MDUP} \
                 M=sample_mark_dup_metrics.txt

    samtools index $OUTPUT_MDUP
## 1. First realign reads with lofreq viterbi
OUTPUT_VITERBI="${basename}_realign.bam"
lofreq viterbi --verbose -f $REF --verbose $OUTPUT_MDUP | samtools sort - >  $OUTPUT_VITERBI
samtools index $OUTPUT_VITERBI

## 3.2 Then, insert indel quality score into the BAM files
OUTPUT_INDELQUAL_BAM="${basename}_indelqual.bam"
echo "Inserting indel quality score into the BAM file..."
lofreq indelqual --dindel -f $REF --verbose $OUTPUT_VITERBI | samtools sort - > $OUTPUT_INDELQUAL_BAM
samtools index $OUTPUT_INDELQUAL_BAM

## 3.3 Insert base and indel alignment quality with lofreq alnqual
OUTPUT_ALNQUAL="${basename}_alnqual.bam"
lofreq alnqual -b $OUTPUT_INDELQUAL_BAM $REF | samtools sort - > $OUTPUT_ALNQUAL
samtools index $OUTPUT_ALNQUAL  

## 3.4 Run Lofreq - SNP calling
OUTPUT_SNV="${basename}_snvs.vcf"
lofreq call --call-indels --verbose -f $REF -o $OUTPUT_SNV $OUTPUT_ALNQUAL

Thanks in advance for your help.