Closed vivekbhr closed 5 years ago
Hi Vivek,
I believe that most downstream applications that use Bismark BAM files (e.g. `bismark_methylation_extractor`, `deduplicate_bismark`, `SNPsplit`, `reStrainingOrder` etc.) make use of the `XR` and `XG` tag combination. The only way to make this work for your 'custom' files would be to reconstitute these read/genome conversion tags, I am afraid.
Just out of interest, how did you end up with those files, using a different aligner?
Yes, the data was mapped using BWA and we used a custom script to make the XM tag. Maybe I can add the other two tags too. Are they supposed to contain the bisulfite converted sequence of the reference and the read?
No, the combination of read conversion (`XR`, can be `CT` or `GA`) and genome conversion (`XG`, can be `CT` or `GA`) indicates which of the four bisulfite strands a read came from (`OT`, `OB`, `CTOT`, `CTOB`).
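As a rough illustration of that tag combination, here is a minimal Python sketch of the lookup for single-end reads (or the first read of a pair). The pairing of tag values to strands follows the convention I understand Bismark to use; the names `STRAND_BY_TAGS` and `bisulfite_strand` are made up for this example and are not part of Bismark.

```python
# Hypothetical helper: map (XR, XG) tag values to the bisulfite strand name.
# Key = (read conversion, genome conversion).
STRAND_BY_TAGS = {
    ("CT", "CT"): "OT",    # original top strand
    ("GA", "CT"): "CTOT",  # complementary to original top strand
    ("GA", "GA"): "CTOB",  # complementary to original bottom strand
    ("CT", "GA"): "OB",    # original bottom strand
}


def bisulfite_strand(xr: str, xg: str) -> str:
    """Return the strand name for a read/genome conversion pair."""
    try:
        return STRAND_BY_TAGS[(xr, xg)]
    except KeyError:
        raise ValueError(f"unexpected XR/XG combination: {xr}/{xg}") from None
```

For example, `bisulfite_strand("CT", "GA")` gives `"OB"`.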
It sounds almost like it would be a good idea to use a tool that does this kind of processing natively, wait, I think there is one...! I am sure you have a good reason to start from BWA, and re-implement the things required to proceed with bisulfite-processing?
@vivekbhr If you want to extract methylation from a file without bismark specific tags, you could use MethylDackel
Thanks, Felix and Martin. Yes, we used a modified library prep so we wanted to do some modifications in the methylation tagging. I managed to add the XR and XG tags and bismark now outputs CpG/CHH.. files, but doesn't give the full output (plots etc..) as it apparently doesn't recognize the extended CIGAR 'S'. I'll check what options I have.
As of version 0.22.0, Bismark supports local mode (option `--local`) for both Bowtie 2 and HISAT2, and thus also the CIGAR operation `S`. See here: https://github.com/FelixKrueger/Bismark/releases.
Interesting.. It seems that the default bioconda installation doesn't install the latest release. I got bismark 0.20.0 instead of 0.22.1. I re-installed now. thanks
Ah, it seems we might have to change the conda recipe then... Let me know if you have any questions, but I don't think there is a reason why you would ever re-purpose a non-bisulfite aligner to do what you need doing...
I also have some BAM files from BWA. Can you tell me how you added the XR and XG tags?
The `XR` and `XG` tags indicate the conversion state for the reads and genome, respectively. Not sure if this is something that even applies to your experiment?
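For anyone wanting to reconstitute the two tags, here is a hedged Python sketch that guesses `XR`/`XG` from the SAM FLAG alone. It assumes a directional library and that the reverse-strand and read 1/read 2 FLAG bits follow the convention Bismark uses in its own output. Whether that holds for a given BWA or bwa-meth pipeline is an assumption you would need to check. `infer_conversion_tags` is a made-up name, not a Bismark function.

```python
# Standard SAM FLAG bits (from the SAM specification).
FLAG_PAIRED = 0x1
FLAG_REVERSE = 0x10
FLAG_READ2 = 0x80


def infer_conversion_tags(flag: int) -> tuple[str, str]:
    """Guess (XR, XG) for a read from a DIRECTIONAL library (assumption)."""
    is_read2 = bool(flag & FLAG_PAIRED and flag & FLAG_READ2)
    is_reverse = bool(flag & FLAG_REVERSE)
    # Directional libraries: single-end reads and read 1 are C-to-T converted,
    # read 2 is the reverse complement and therefore looks G-to-A converted.
    xr = "GA" if is_read2 else "CT"
    # Top-strand fragments: read 1 (or a single-end read) maps forward,
    # read 2 maps reverse. Everything else is a bottom-strand fragment.
    from_top_strand = is_reverse == is_read2
    xg = "CT" if from_top_strand else "GA"
    return xr, xg
```

A forward single-end read (`flag=0`) comes out as `("CT", "CT")` and a reverse one (`flag=16`) as `("CT", "GA")`. Non-directional (PBAT-like or four-strand) libraries cannot be resolved from the FLAG this way.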
In fact, I want to generate XM tags for my BAM file for downstream analysis.
My existing BAM file was generated by BWA-Meth. Due to the large file size, I don't have time to re-align it right now, and I urgently need to use an analysis tool that requires the XM tag. This situation is causing me a lot of stress. For this reason, I found another tool that can generate XM tags, but it seems to generate them based on the XR and XG tags.
Bismark generates the XM tag based on the actually observed sequence, the equivalent extracted genomic sequence (which needs to have handled indels and softclipping appropriately at this point already), and the read conversion state:
https://github.com/FelixKrueger/Bismark/blob/37e2cad18621c2619a9e02d1a69fdfec1819ed23/bismark#L4772
I haven't got a clue whether this is available in bwa-meth output or not I am afraid.
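To make the idea concrete, here is a simplified Python sketch of such a call string, not Bismark's actual code. It only covers the C/T comparison (genome conversion `CT`), and it assumes the read and the genomic sequence are uppercase, already aligned base for base (no indels or soft-clipping left), and that the genomic sequence carries two extra downstream bases for context. The G/A case for the other genome conversion would need the mirrored comparison and is left out. `methylation_call` is a hypothetical name; the letters follow the documented XM encoding (`Z`/`z` CpG, `X`/`x` CHG, `H`/`h` CHH, `U`/`u` unknown context, uppercase = methylated).

```python
def methylation_call(read_seq: str, genomic_seq: str) -> str:
    """Build an XM-style string for an ungapped, forward-strand alignment.

    genomic_seq must be exactly two bases longer than read_seq so that the
    last read position still has two downstream bases of context.
    """
    if len(genomic_seq) != len(read_seq) + 2:
        raise ValueError("genomic_seq must be len(read_seq) + 2 bases long")

    calls = []
    for i, read_base in enumerate(read_seq):
        # Only genomic cytosines read as C or T are informative.
        if genomic_seq[i] != "C" or read_base not in ("C", "T"):
            calls.append(".")
            continue
        next1, next2 = genomic_seq[i + 1], genomic_seq[i + 2]
        if next1 == "G":
            symbol = "z"  # CpG context
        elif next1 == "N":
            symbol = "u"  # context cannot be determined
        elif next2 == "G":
            symbol = "x"  # CHG context
        elif next2 == "N":
            symbol = "u"
        else:
            symbol = "h"  # CHH context
        # A retained C means methylated (uppercase); a T means converted.
        calls.append(symbol.upper() if read_base == "C" else symbol)
    return "".join(calls)
```

For example, `methylation_call("ACGTTAGC", "ACGTCAGCTA")` gives `".Z..x..H"`.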
thank you very much, I will study your code carefully
Hi Felix
I have some custom tagged bam files where the first 14 tags are the same as bismark output, but the XR and XG tags are missing. Is it possible to extract the methylation calls using bismark methylation extractor without considering these two tags? I basically want the CpG/CHH.. bedgraphs from these bam files at the end. If something in the code needs to be changed, can you point me to what I should do so I can modify my local copy of bismark?
Thanks, Vivek