nash5202 closed 1 year ago
I am not sure. Were you able to run bwameth.py index without problems?
Do other tools (fastqc, etc) run without problems on your fastq files?
I am having the same error when converting the sam files generated by alignment with bwa-meth to bam files:
[W::sam_parse1] urecognized reference name; treated as unmapped
I tried two variations of the command to align a single-end read and convert sam to bam:
bwameth.py --threads 8 --reference ref_genome.fa sample.fastq > sample.sam
samtools view -b -o sample.bam sample.sam
and
bwameth.py --reference ref_genome.fa sample.fastq -t 8 | samtools view -b - > sample.bam
I was able to run bwameth.py index without any issues. I have tried the above commands on a few different fastq files and I am currently double-checking my fastq files with fastqc for any issues.
The environment I am using has the following package versions installed:
bwa 0.7.17
bwa-mem2 2.2.1
bwameth 0.2.5
python 3.11.0
samtools 1.6
toolshed 0.4.6
Please let me know if you have any advice on how to troubleshoot this error message.
Thank you.
One problem I noticed is that bwameth.py puts nothing in the chrom column when the read is unmapped, while it should put an * there:
readname\t77\t\t0\t0\t*\t*\t0\t0\tGNAATCATGTGTCTTCTTATCTCTAAATCAGAATCCCGCCCAAACGAAACGATACGACAACGCCGCGAAACCTCGATTAACCTCAAATAACCAATCCCCCACCGATCCCCGCCGCCGAACCCCCCGCGCCAGCCCGCGCCCCGCGCGGCCG\t;#CCCCCCC;CCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCC-CCCCCCC;CCCCCCCCCCCCCCCCCCCCCCC;CC--CCC;CC;CCCC;C;C;;;CC-CC--C;-CC--C;-CC--C---;--C--C--;C--;---CC;CCC-----\tAS:i:0\tXS:i:0\tRG:Z:test\tYC:Z:CT\n
This may cause some software to report Unrecognized reference name.
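As a stopgap for output from a bwameth.py version without the fix, the blank column can be filled in before the SAM reaches other tools. The following is a minimal sketch, not part of bwa-meth: the function name fill_blank_rname is hypothetical, it assumes tab-separated SAM text, and the record in the demo is made up for illustration.

```python
def fill_blank_rname(line: str) -> str:
    """Return the SAM line with an empty RNAME (3rd column) replaced by '*'."""
    # Header lines start with '@' and are passed through untouched.
    if line.startswith("@"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    # An alignment record has at least 11 mandatory columns; RNAME is the 3rd.
    if len(fields) >= 11 and fields[2] == "":
        fields[2] = "*"
    return "\t".join(fields) + newline


if __name__ == "__main__":
    # Made-up unmapped record with a blank chrom column.
    record = "read1\t77\t\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tYC:Z:CT\n"
    print(fill_blank_rname(record), end="")
```

Applying the function to every line read from standard input would let it sit between bwameth.py and samtools view in a pipe. Upgrading to a bwameth.py that writes * itself is the cleaner solution.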
@haodongchen, thanks for diagnosing. I pushed a fix for this. Would you or others in this issue give it a try and let me know? Thanks!
@brentp, I can confirm that the blank chrom column is causing the issue because I did not get this error when I removed reads that were blank in the chrom column.
I pulled your most recent update and ran alignment with the updated bwameth.py.
bwameth.py --reference hg19.p13.plusMT.no_alt_analysis_set.fa.gz SRR536237.fastq > SRR536237.sam
The resulting sam file still contained blank chrom columns for some reads:
SRR536237.29 16 chr17 75537577 60 101M * 0 0 ACTACCCCGAATAAACCACACTCCTTACAAAAACCAAACAACTACGTTAAAAAAATATTAATATTTATCAAAAAACCCTCTTCCAACCATTTTTAATTTTT #########A>3><3<>A>=953>;;=?????>7;A@A@;?@;.<DDDDDEECCB@B<DDECECDB?16DDIEE??3A+3:FAIEE@>DDDDADDDD???? NM:i:1 MD:Z:76A24 AS:i:98 XS:i:29 RG:Z:SRR536237 YC:Z:CT YD:Z:r
SRR536237.30 4 0 0 * * 0 0 TTGTTGTTTGGAGATGTTTTGGTTTTGTGGTTTTAAGGCTTTGGAGAAGGGAGGGGAAAATATGTGTTTTTTTTTTGAATTAGGGTTATTAAAGTTAATTT ????8:ADD>?+2++2AEEDD<<C;FBEEI?8))*:*?*09DBB######################################################### AS:i:0 XS:i:0 RG:Z:SRR536237 YC:Z:CT
When I ran samtools view -S -b SRR536237.sam > SRR536237.bam, it returned the same error:
[W::sam_parse1] urecognized reference name; treated as unmapped
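One way to check whether a given SAM output is still affected is to count the records whose chrom column is empty. This is only a rough sketch: count_blank_rname is a hypothetical helper, it assumes tab-separated SAM text, and the two records in the demo are shortened, made-up examples rather than real output.

```python
from typing import Iterable, Tuple


def count_blank_rname(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (records_with_blank_rname, total_records) for SAM text lines."""
    blank = 0
    total = 0
    for line in lines:
        # Skip header lines and empty lines.
        if line.startswith("@") or not line.strip():
            continue
        total += 1
        fields = line.rstrip("\n").split("\t")
        # RNAME is the 3rd tab-separated column.
        if len(fields) > 2 and fields[2] == "":
            blank += 1
    return blank, total


if __name__ == "__main__":
    # Made-up records: one mapped read and one unmapped read with a blank RNAME.
    demo = [
        "@SQ\tSN:chr17\tLN:1000\n",
        "readA\t16\tchr17\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
        "readB\t4\t\t0\t0\t*\t*\t0\t0\tTTGT\tIIII\n",
    ]
    blank, total = count_blank_rname(demo)
    print(f"{blank} of {total} records have a blank chrom column")
```

For a real file, passing an open file handle works, for example count_blank_rname(open("SRR536237.sam")). A nonzero first value suggests the output came from a bwameth.py without the fix.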
Please let me know if any other information would be helpful. Thank you.
Thanks for following up, @MSleeper1. Can you share a fastq with 2 reads that show the problem? I suspect this is related to not having paired-end reads, as there are likely few users with single-end reads.
I tried the fix and it solved the problem I got.
@MSleeper1 and @haodongchen thanks for following up! I'll tag a new release with the fix.
I dug into which script was being used when I called bwameth.py and found that it has been defaulting to miniconda3/envs/bwa/bin/bwameth.py.
I then ran the alignment with the absolute path miniconda3/pkgs/bwameth-0.2.5-pyh5e36f6f_0/python-scripts/bwameth.py, which contains your most recent push. This solved my problem; there is now an * in the chrom columns that were previously blank.
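For anyone else checking which copy of the script their shell actually picks up, a small sketch along these lines may help. It uses only the Python standard library; resolve_script is a hypothetical helper name, and the result depends entirely on the local PATH.

```python
import os
import shutil
from typing import Optional


def resolve_script(name: str = "bwameth.py") -> Optional[str]:
    """Return the real path of the first executable called `name` on PATH, or None."""
    path = shutil.which(name)
    if path is None:
        return None
    # Follow symlinks so an environment's bin/ entry shows its actual target.
    return os.path.realpath(path)


if __name__ == "__main__":
    location = resolve_script()
    if location is None:
        print("bwameth.py was not found on PATH")
    else:
        print(f"bwameth.py resolves to: {location}")
```

If the printed path is not the copy that contains the fix, calling the script by its full path, as described above, or updating the installed package should make the fixed version the one that runs.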
Thank you for all the help. :)
I have some Enzymatic methyl sequencing data that I am trying to align using bwa-meth. However, when I try to convert the sam file that the aligner generates to a bam file, I get the following parsing warning:
[W::sam_parse1] unrecognized reference name ""; treated as unmapped
I am not sure what the issue is as I have used the same hg38 reference genome for alignment of other types of sequencing data. Also, I am using the following command for the task:
bwameth.py --reference ref_genome.fa Sample1_1.fastq.gz Sample1_2.fastq.gz -t 7 | samtools view -b - > sample1.bam
Any help is appreciated.
Thank you.