Closed JennKnapp closed 4 months ago
@JennKnapp Let me look for the issue that was closed. At a glance, this looks like there are not enough reads for KMA to create a consensus sequence before calling SNPs.
Connecting #219. @JennKnapp are you able to share the reads with me so that I can debug? Cheers.
Sure, here is a link to a set of the fastq.gz files I was using when I ran into this issue: https://drive.google.com/drive/folders/1LS8cCJMr08nzNGWevZdteio5zcgC1B--?usp=sharing
@JennKnapp running fastqc on the two samples shows adapter content. Maybe trim before running rgi bwt?
Some of the sequences also have some "N"s. Feel free to share the trimmed reads and I will test again. Cheers.
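For anyone who wants to quantify the "N" content before and after trimming, here is a minimal sketch. It assumes plain four-line FASTQ records (no wrapped sequence lines); the function names and the file path in the usage note are hypothetical and are not part of RGI or FastQC.

```python
import gzip


def count_reads_with_n(fastq_lines):
    """Return (total_reads, reads_with_n) for an iterable of FASTQ lines.

    Assumes each record is exactly four lines, so the sequence is
    always the second line of a record.
    """
    total = 0
    with_n = 0
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1:  # sequence line of the record
            total += 1
            if "N" in line.upper():
                with_n += 1
    return total, with_n


def count_reads_with_n_in_file(path):
    """Same count, reading directly from a gzip-compressed FASTQ file."""
    with gzip.open(path, "rt") as handle:
        return count_reads_with_n(handle)
```

Calling `count_reads_with_n_in_file("R1.fastq.gz")` on the raw and on the trimmed reads gives a quick before/after comparison.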
@raphenya I've uploaded the trimmed reads to the same drive folder; I removed adapters and low-quality bases. The same error messages still appear when running rgi bwt on these cleaned-up reads, so hopefully you'll have better luck. Thanks!
@JennKnapp ok, cool. Thanks. I will take a look. Cheers.
@JennKnapp ok, I did the following:
There are still lots of k-mers, but that's to be expected in this type of data.
[W::sam_parse1] mapped query cannot have zero coordinate; treated as unmapped
sam file to a bam file
These are the only differences:
diff kma_headers.txt bowtie2_headers.txt
1,2c1
< @HD VN:1.6 GO:reference
< @PG ID:KMA PN:kma VN:1.4.9 CL:kma -mem_mode -ex_mode -1t1 -vcf -ipe "CB-Shotgun_S119_R1_001.fastq.gz" "CB-Shotgun_S119_R2_001.fastq.gz" -t 20 -t_db /workspace/lab/mcarthurlab/raphenar/issue256/localDB/bwt/card_reference/kma -o "/workspace/lab/mcarthurlab/raphenar/issue256/output_kma.temp.sam.temp" -sam
---
> @HD VN:1.5 SO:unsorted GO:query
4806a4806
> @PG ID:bowtie2 PN:bowtie2 VN:2.5.1 CL:"/var/miniconda3/envs/rgi603/bin/bowtie2-align-s --wrapper basic-0 --quiet --very-sensitive-local --threads 20 -x /workspace/lab/mcarthurlab/raphenar/issue256/localDB/bwt/card_reference/bowtie2 -S /workspace/lab/mcarthurlab/raphenar/issue256/output.temp.sam -1 CB-Shotgun_S119_R1_001.fastq.gz -2 CB-Shotgun_S119_R2_001.fastq.gz"
e.g. WARNING 2023-12-08 20:05:03,653 : model with id : 33, has few mapped reads to make consensus sequence skipping: 'ARO:3003109|ID:33|Name:msrE|NCBI:EU294228.1'
These warnings occur only with the KMA aligner, because we try to use the reads to create a consensus sequence.
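When the log is long, it can help to tally the two kinds of messages separately. The sketch below is an illustration only: it assumes the warning wording shown in this thread, and the function name is hypothetical, not an RGI utility.

```python
import re

# Matches the RGI warning quoted in this thread, tolerating spacing differences.
SKIP_PATTERN = re.compile(r"model with id\s*:\s*(\d+)\s*,\s*has few mapped reads")
SAM_PARSE_TEXT = "mapped query cannot have zero coordinate"


def summarize_warnings(log_lines):
    """Return (skipped_model_ids, sam_parse_warning_count) for log lines."""
    skipped_ids = []
    sam_parse_count = 0
    for line in log_lines:
        match = SKIP_PATTERN.search(line)
        if match:
            skipped_ids.append(int(match.group(1)))
        elif SAM_PARSE_TEXT in line:
            sam_parse_count += 1
    return skipped_ids, sam_parse_count
```

The returned list of model ids can then be checked against the models that actually appear in the output tables.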
It's not obvious what's causing the [W::sam_parse1] mapped query cannot have zero coordinate; treated as unmapped warning, as both bowtie2 and kma have the same number of sequences added to their headers. I will do more testing with reads that are not expected to map and reads that will map, and see if I get the same warnings.
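One way to check the header comparison without eyeballing a diff is to count header record types in each SAM file. This is a rough sketch, assuming uncompressed SAM text; the function name is hypothetical.

```python
from collections import Counter


def header_record_counts(sam_lines):
    """Count SAM header record types (@HD, @SQ, @PG, ...) in an iterable of lines.

    Header lines start with '@' and come before any alignment record,
    so counting stops at the first non-header line.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.startswith("@"):
            break
        record_type = line.rstrip("\n").split("\t", 1)[0]
        counts[record_type] += 1
    return counts
```

If the `@SQ` counts from the KMA and bowtie2 files match, both aligners declared the same number of reference sequences, which is what the diff above suggests.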
Thanks a bunch for looking into this! I was able to run rgi bwt on a few other similar datasets, and although I got the same errors, the runs completed and produced results. I would rather stick with the kma aligner over bowtie2, as it's better for metagenomic data.
I am also trying different pre-processing steps (trimming, filtering, removing duplicate reads, etc.), so if any of these result in the error messages going away I will update this thread too.
@JennKnapp Ok, I think I might have found why we are getting the [W::sam_parse1] mapped query cannot have zero coordinate; treated as unmapped warning. When aligning with KMA, the RNAME is not set to the reference name, just to *. My next test will pull only 4 reads (2 with RNAME and 2 without), and I will also post an issue with both the samtools and KMA developers. Cheers.
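To pull out candidate reads for a small test like this, the sketch below lists SAM records whose FLAG does not have the unmapped bit (0x4) set but whose RNAME is * or whose POS is 0. In the SAM format those two values mean "no reference" and "no position", so such records look inconsistent. The function name is hypothetical, and this is only a guess at which records trigger the samtools warning, not a confirmed diagnosis.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def find_inconsistent_records(sam_lines):
    """Return (qname, flag, rname, pos) for records flagged as mapped
    but lacking a reference name or a position."""
    hits = []
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & UNMAPPED_FLAG:
            continue  # properly marked as unmapped
        if rname == "*" or pos == 0:
            hits.append((qname, flag, rname, pos))
    return hits
```

Running it over the KMA SAM output and over the bowtie2 SAM output should show whether only KMA produces such records.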
Issue is stale and will be closed in 7 days unless there is new activity
@JennKnapp see https://bitbucket.org/genomicepidemiology/kma/issues/86/w-sam_parse1-mapped-query-cannot-have-zero
This link is unavailable: This issue is submitted and being reviewed.
Hi @raphenya, are there any updates or a fix for this? I'm having the same issue as well. Thanks!
I'm afraid we have had no updates @jihen-lau.
@agmcarthur Thank you for your response! I'm wondering if these warnings have any significance. Could our output file still be reliable despite these warnings?
[W::sam_parse1] mapped query cannot have zero coordinate; treated as unmapped and "WARNING :model with id : 128, has few mapped reads to make consensus sequence skipping : ARO:ID"
@jihen-lau that warning is benign. It just means that particular reference sequence had so few reads mapped to it that a consensus allele could not be generated, i.e. that allele is very likely not present in your data.
@agmcarthur Thanks for the clarification! In that case, can I take the multiple lines of "[W::sam_parse1] mapped query cannot have zero coordinate; treated as unmapped" as a benign warning as well?
Yes indeed!
@agmcarthur once again thanks for the clarification and the tools!
I am running into the exact same issue as previously posted and closed without comment. Is there a fix for this, or any troubleshooting suggestions?
Command: rgi bwt --local --read_one R1.fastq.gz --read_two R2.fastq.gz --output_file ~/card_output/sample_name
Many many lines of: [W::sam_parse1] mapped query cannot have zero coordinate; treated as unmapped
Then it eventually shows: "merging from 0 files and 16 in-memory blocks" and then many lines of: "WARNING :model with id : 128, has few mapped reads to make consensus sequence skipping : ARO:ID"
The process is then terminated.