Open kusonahikari opened 3 years ago
Could you be specific about what the differences between your pipeline and viridian are (ideally, provide your calls), and where exactly in the spike you want us to look? You give a plot showing coverage, but I am not sure what you want me to take away from it. What does it show, in your mind?
I have provided the calls in the Dropbox folder. For the spike, the region is that of the 72nd primer pair, 21658-22038. Actually, I'm thinking of trying again with other samples as well, since last time I tried with only one sample (a bad one, I guess). For the coverage, I found that the low-coverage regions aligned with the ambiguous regions.
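One way to make that comparison concrete is to list the runs of N in a consensus and check which of them fall inside the amplicon. This is only a minimal sketch: the helper names `n_runs` and `overlapping` are made up for illustration, and it assumes the consensus coordinates line up with MN908947.3 (no indels shifting positions).

```python
import re

# Amplicon 72 coordinates as quoted in this thread (1-based, inclusive).
AMPLICON_72 = (21658, 22038)


def n_runs(consensus):
    """Return 1-based inclusive (start, end) for every run of N in the sequence."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"N+", consensus.upper())]


def overlapping(runs, start, end):
    """Keep only the runs that overlap the closed interval [start, end]."""
    return [(s, e) for s, e in runs if s <= end and e >= start]


if __name__ == "__main__":
    # Toy sequence (not real data): 50 Ns placed inside the amplicon.
    toy = "A" * 21699 + "N" * 50 + "A" * 300
    print(overlapping(n_runs(toy), *AMPLICON_72))
```

Running the same check on both consensus files would show whether the ambiguous stretches differ between the two pipelines in that region.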
I could only see the two FASTQ files of reads in Dropbox. I assembled them with viridian version 0.1.0. I don't see the same as what you are seeing.
The only dropped amplicon was the one at position 20173-20572, which does not overlap with spike. Inspecting the read mapping confirms that there are almost no reads there, so this looks correct.
Screenshot is attached of the assembly made by viridian (top) compared to reference genome MN908947.3 (bottom). BLAST hits in red (both are 99% identity). The only Ns in the Viridian consensus are that dropped amplicon, visible as the gap between the two BLAST hits.
Just to comment, Tung is comparing a consensus sequence that went through viridian on GPAS (including human read removal steps in Catsup). I can provide the trimmed fastq's as they're on the OCI bucket if needed.
@OBannis thanks, could you share the trimmed fastqs with me please?
Sure have emailed you a link
Comparing the consensus results, we found that they were not identical between viridian and our lab-built ivar pipeline on the ARTIC V3 protocol. With viridian, the regions with low depth of coverage were heavily affected, resulting in ambiguities, especially in the spike region. Below is our depth-of-coverage plot after filtering out bad reads. In addition, filtering human reads from the raw reads would create artifact SNPs from low-frequency alleles. Our pipeline consists of mapping the raw reads (`bwa-mem`), then filtering out bad-quality reads (`samtools` with the `-bSq 20` flag). The filtered reads then had the amplicon primers trimmed (`ivar` with the `-e` flag). The consensus sequences were called after that.
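For reference, the steps described above could look roughly like the commands below. This is a sketch under assumptions, not our exact script: the file names, the `build_pipeline` helper, the sort and index steps, and the whole consensus step (`samtools mpileup ... | ivar consensus`, as in the iVar manual) are filled in for illustration. Only `bwa mem`, `samtools view -bSq 20` and `ivar trim -e` come from the description.

```python
import shlex


def build_pipeline(ref, r1, r2, primer_bed, prefix, min_mapq=20):
    """Build (not run) shell commands for a bwa / samtools / ivar consensus pipeline."""
    q = shlex.quote
    filtered = q(f"{prefix}.filtered.bam")
    trimmed_prefix = q(f"{prefix}.trimmed")
    trimmed = q(f"{prefix}.trimmed.bam")
    trimmed_sorted = q(f"{prefix}.trimmed.sorted.bam")
    consensus_prefix = q(f"{prefix}.consensus")
    return [
        # Map the raw reads, keep alignments with MAPQ >= min_mapq, sort by coordinate.
        f"bwa mem {q(ref)} {q(r1)} {q(r2)} "
        f"| samtools view -bSq {min_mapq} - "
        f"| samtools sort -o {filtered} -",
        # Assumption: index the filtered BAM before trimming.
        f"samtools index {filtered}",
        # Trim amplicon primers; -e also keeps reads that contain no primer.
        f"ivar trim -e -i {filtered} -b {q(primer_bed)} -p {trimmed_prefix}",
        # Assumption: re-sort after trimming so mpileup sees sorted input.
        f"samtools sort -o {trimmed_sorted} {trimmed}",
        # Assumption: consensus call following the iVar manual's mpileup recipe.
        f"samtools mpileup -aa -A -d 0 -Q 0 {trimmed_sorted} "
        f"| ivar consensus -p {consensus_prefix}",
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in build_pipeline("MN908947.3.fasta", "reads_1.fastq.gz",
                              "reads_2.fastq.gz", "artic_v3.primer.bed", "sample1"):
        print(cmd)
```

Note that the `-q 20` part of `-bSq 20` filters on mapping quality, not on base quality.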