Closed nchernia closed 6 years ago
I don't think so, but I think the XM tag could be added. It would be nice not to have to hit the reference index for methylation extraction...
@brentp
I'm working on a tool that needs to identify methylation on a per-read basis. Performance is poor because I have to do a faidx lookup for every read to determine methylation state. What do you think about adding the XM tag like Bismark does? I could take a pass at it if you think it would be a useful addition.
When I have to do something like this, I make sure the reads are sorted by chromosome. Then, the first time a new chrom is seen, I read the fasta for that chrom into memory (so that's e.g. 250 MB for human chr1) and use it as a string for lookup. You can also use pyfaidx with a large lookahead value.
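For anyone landing here later, the per-chromosome caching idea above could look roughly like the sketch below. It assumes a plain-text, uncompressed FASTA and reads sorted by chromosome; `read_chrom` and `ChromCache` are hypothetical names made up for illustration, not part of bwa-meth, pyfaidx, or MethylDackel.

```python
def read_chrom(fasta_path, chrom):
    """Return the full sequence of one record from a plain-text FASTA file."""
    parts = []
    found = False
    with open(fasta_path) as fh:
        for line in fh:
            if line.startswith(">"):
                if found:
                    break  # reached the next record, so we are done
                fields = line[1:].split()
                found = bool(fields) and fields[0] == chrom
            elif found:
                parts.append(line.strip())
    if not found:
        raise KeyError(chrom)
    return "".join(parts)


class ChromCache:
    """Hypothetical helper: keep only the current chromosome in memory."""

    def __init__(self, fasta_path):
        self.fasta_path = fasta_path
        self.chrom = None
        self.seq = ""
        self.loads = 0  # how many times a chromosome was read from disk

    def fetch(self, chrom, start, end):
        """Return reference bases for the 0-based, half-open interval [start, end)."""
        if chrom != self.chrom:
            # First read on a new chromosome: load it once, then reuse it.
            self.seq = read_chrom(self.fasta_path, chrom)
            self.chrom = chrom
            self.loads += 1
        return self.seq[start:end]
```

With sorted input, each chromosome is read from disk once and every later lookup is a string slice. With unsorted input the cache would thrash, since only one chromosome is kept at a time.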
I'd prefer to keep this out of bwa-meth. I think MethylDackel can do this anyway, no?
MethylDackel just added this feature. It’s a function called “perRead”
Sweet! Let's close this then. bwa-meth does a good job for what it does, but I want to keep it fairly atomic and let tools like MethylDackel do the downstream stuff.
@nchernia : Nice! looks like perRead does what I want. Never would have thought to look at that branch without your prompting - Thanks!
Thanks for the tool. One question - is there an easy way to detect from the read itself whether or not it is methylated? MethylDackel calculates on a per-cytosine basis but I'm also looking for essentially the inverse. Bismark does this with the XM tag.
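To make the question concrete, here is a rough sketch of what a Bismark-style per-read call string looks like when computed from a read and the reference. It is only an illustration under strong assumptions: the read is aligned without gaps or clipping, it comes from the original top strand (so unmethylated C shows up as T), and `xm_string` is a hypothetical name, not a function from bwa-meth, Bismark, or MethylDackel. The letters follow the Bismark convention as I understand it: z/Z for CpG, x/X for CHG, h/H for CHH, u/U for unknown context, uppercase for methylated.

```python
def xm_string(read, ref, start):
    """Build an XM-like call string for a gapless, top-strand alignment.

    read  -- read sequence
    ref   -- reference sequence for the chromosome, as a string
    start -- 0-based reference position of the first read base
    """
    calls = []
    for i, base in enumerate(read.upper()):
        pos = start + i
        if ref[pos].upper() != "C" or base not in ("C", "T"):
            calls.append(".")  # not a reference cytosine, or not informative
            continue
        nxt = ref[pos + 1:pos + 3].upper()  # up to two bases downstream
        if nxt[:1] == "G":
            code = "z"  # CpG
        elif len(nxt) == 2 and nxt[0] in "ACT" and nxt[1] == "G":
            code = "x"  # CHG
        elif len(nxt) == 2 and nxt[0] in "ACT" and nxt[1] in "ACT":
            code = "h"  # CHH
        else:
            code = "u"  # context unknown (N, or too close to the end)
        # C retained in the read = methylated (uppercase); C->T = unmethylated
        calls.append(code.upper() if base == "C" else code)
    return "".join(calls)


print(xm_string("ACGTTAGCTTT", "ACGTCAGCTTC", 0))  # prints .Z..x..H..u
```

A real implementation would also need to walk the CIGAR string and handle reads from the bottom strand (G-to-A conversions), which this sketch leaves out.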