Closed katwre closed 4 years ago
Dear @katwre,
@sorjuela is away for a couple days, so I'll give a brief answer. I think 'no', because under the hood we use methtuple
from @PeteHaitch and on the main page of his repo, it says "Currently, the SAM/BAM file must have been created with Bismark." But, to be honest, I don't know what the differences are in what gets into the BAM; maybe a simple conversion can be done.
Best, Mark
methtuple parses the XM
-tag of Bismark. From memory, bwa-meth doesn't produce this tag, so that's why it's limited to Bismark
@markrobinsonuzh and @PeteHaitch , Thank you very much for your quick reply!
methtuple
-based approach looks great, but I think for now I will give a try with SNP-based one.
So then I have a question, if I go for using a bam file + output from BisSNP/biscuit then "MD" and "XG" tags in the alignment could be sufficient? I am asking because it looks like in DAMEfinder code "MD","XM","XR","XG"
tags are extracted from a BAM file (e.g. like here https://github.com/markrobinsonuzh/DAMEfinder/blob/87f43acb506131eadfc289b054f3813bc46fc74f/R/split_bams.R#L90), but actually only MD (mismatching positions/bases) and XG (genome conversion state for the alignment, it's CT or GA from what I've just read) are used and in my bwa-meth alignment I can see MD tag and YC tag that is always either GA or CT...
ST-E00351:144:HF3:6:2219:28412:67164 163 chr1 9990 0 8M2D108M = 9990 118 TATCATTCATAACCCTAACCCTA
ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACC AAAFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
JJJJJJJJJJJJAJJJJJFJFJJAJJFJJJJJJJJJJAJJJFJJJAJ<A-<AJF--A-AAF7AFF-AAJA MC:Z:8M2D108M YC:Z:GA MD:Z:1G2T0G2^CG108 YD:Z:r PG:Z:MarkDupl
icates RG:Z:XXX-RUNID-0144-FLOWCELL-AHF3-LANE-6__val_ NM:i:5 AS:i:108 XS:i:107
Hi @katwre, thanks for your interest in the package!
You’re right, for the SNP-mode I actually only use the MD and XG tags. The problem here is that bismark has a specific type of MD tag, which could look like this for example: 2C0C3C7C2C7C1C0C1C1C1C2C2C0C1C2C2C0C6C7C0C0C0C0C0C4C3C1C0C7C5C0C1C5C1C5C1C8
, unlike a typical MD tag that looks more like the one in your bwa bam file. Looking at your read makes me think that maybe we could combine the MD with that MC tag, but would have to check what that is next week. My current MD parser only works for the bismark MD, sorry.
Best, Stephany
Hi @sorjuela ,
Thank you for your response! If you would have any idea how to use MD or combination of MC and MC tags or how actually Bismark creates MD tags that would be very useful!! (e.g. do MD tags have only Gs or Cs in their string? are MD tags calculated from CIGAR strings?) Should I use samtools calmd
to recreate MD tags in order to mimic bismark-like MD tags? I tried already few tools for differential allelicly methylated regions calling that use SNP information and unfortunately they don't really work for me, so that's why I would be open to help make your tool work with bwa-meth
Cheers,
Katarzyna
Several things:
I have rather big files .fast.gz files (paired-end data, around 45G for each fastq.gz corresponding to a mate), it's cancer patient-derived data and it looks like bwa-meth performs better on this data than bismark
Thank you a lot for explanation of your implementation and having a look at read tags information! Since it doesn't look that there is a straight-forward and fast solution, then probably I will just give up with it for now. I call methylation using methylDacker (https://github.com/dpryan79/MethylDackel) for reads aligned using bwa-meth
I tried biscuit (https://github.com/zwdzwd/biscuit) and https://github.com/astatham/snappymeth
I think this issue can be closed then Thanks, Katarzyna
Hi @sorjuela , great package! I was looking for a differential ASM detection tool like yours, however I have BAM files generated by bwa-meth and I do not want to use bismark, is there any possibility to use bwa-meth output as input of your tool? Best, Katarzyna