Open tomSimonet opened 4 years ago
Hello, thank you for raising this issue. You are correct in that ivar only trims primers from the first region. We will try to fix this within the next 2 weeks.
Currently, specifying -q 0
is the only way to disable quality trimming. We could try to add a flag to disable this.
I recently ran into this issue as well with ivar-1.3.1: I found that only the first reference sequence can be trimmed with ivar trim
.
The logic in ivar.cpp
and trim_primer_quality.cpp
seems a bit strange: the region_
parameter is important to trim_bam_qual_primer()
, but (as far as I can tell) there is no way to set it from the command line so it always defaults to the first reference sequence in the header.
I can work around this issue given our current needs, but trimming only the first reference sequence is a surprising and unfortunate limitation for an otherwise useful tool. Ideally, ivar trim
would be able to trim reads from more than one reference sequence, regardless of order in the header. Short of that, it would be great if this limitation were clearly documented.
In any case, thank you very much for writing the tool!
I'm encountering a slightly different but probably related issue in iVar v1.4.2 when performing primer clipping on a multi-chromosome reference genome (influenza gene segments). While ivar trim
is primer trimming all references in the BAM file (not just the first reference in my case), it is ignoring the first column in the BED file and clipping the same primer coordinates in all references. The effect of this is that iVar is "over-clipping" for primers that don't exist in a particular reference. When I delete primers for all references except one in the BED file, it clips the same coordinates in all references in the BAM file.
hi @sam-baird would you mind sharing an example bam and bed file so I can take a look?
I attached a ZIP with BED files and downsampled BAM files. flu.bam
is the original BAM before primer trimming. flu.trimmed.bam
is from primer trimming using the full BED file:
ivar trim -e -i flu.bam -b AVRL_H5N1_250bpAmpWGS_v2.bed -p flu.trimmed
flu.trimmed_expected.bam
is from primer trimming with a subset of the BED file with only the PP599468-M reference.
Here is an IGV screenshot showing the difference between the three BAM files at the M gene segment coordinates PP599468-M:213-252. The visible letters in the reads indicate soft-clipped bases. The top shows pre-primer trimming where no reads are soft-clipped. The middle shows where extra clipping has occurred using the full BED file. The bottom shows what the clipping should look like (using the BED file with only the PP599468-M reference). In the middle alignment, clipping has expanded beyond the expected end coordinate of 224 of HPAI-MP-250_2_LEFT for most reads, instead ending at position 241, the end coordinate of the NP gene segment primer HPAI-NP-250_2_LEFT which has overlapping coordinates with the MP gene segment primer.
Thanks for taking a look! flu_example.zip
Hello,
I've been trying ivar-1.2.3 trim, but in my hands only the first reference region of my bam file is processed (eg chr1, from a human genome) ; so in the output bam, only reads from this chomosome remain (even with the -e option). Is it the normal behaviour? Have I missed some instruction? I had to make a loop over the chomosomes, changing the bam header at each step, and finally merging the trimmed bams...
./progr/ivar-1.2.3/src/ivar trim -i sample_input.bam -b dna_panel_primers_hg19.bed -m 1 -q 0 -p sample_output.bam > sample_trimP.log 2>&1
also a minor question/request:
as you can see from the command line, I don't want to trim by quality, so I wrote -q 0 ; is it correct? could it be done in a more explicit way (eg "-1 to disable this feature", or with no default value implying nothing is done by default)
many thanks