huishenlab / biscuit

BISulfite-seq CUI Toolkit

NOMEseq epiread output #12

Closed xxxmichixxx closed 2 years ago

xxxmichixxx commented 3 years ago

Hi, thanks for this great tool. Could you explain the NOMEseq epiread output in a bit more detail? Specifically, I would like to know exactly which GpCs I have to extract from the genome sequence, following the first GpC, to match the methylation info to genomic coordinates. Thank you!

zwdzwd commented 3 years ago

I assume that by "match" you mean finding the GpC that is closest to a CpG. For that you need the genomic coordinates of both the CpGs and the GpCs, which you can get from the genome FASTA file. One solution I can think of is to extract the genome sequence starting from the first CpG/GpC, list every CpG and GpC from that first occurrence onward, and compare the two sets. Sorry, that doesn't sound too easy, but I can't think of a better way to do it on a per-read basis.
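A rough Python sketch of that idea, in case it helps. It is not part of biscuit: the function name `dinucleotide_positions` is made up for illustration, and it assumes the reference sequence for the chromosome is already loaded as a plain string. It reports the 0-based position of the C in each dinucleotide, which is also an assumption; check which coordinate convention and which strand your epiread output uses before matching.

```python
def dinucleotide_positions(seq, motif, start=0):
    """Return 0-based positions of the C in every occurrence of `motif`
    ("CG" for CpG, "GC" for GpC) that begins at or after `start`.

    Hypothetical helper for illustration; not a biscuit function.
    """
    seq = seq.upper()
    c_offset = motif.index("C")  # 0 for "CG", 1 for "GC"
    positions = []
    i = seq.find(motif, start)
    while i != -1:
        positions.append(i + c_offset)
        i = seq.find(motif, i + 1)  # step by 1 so overlapping hits are kept
    return positions


# Toy reference sequence (made up, not real genome data).
reference = "ACGTGCGCAT"

cpg = dinucleotide_positions(reference, "CG")
gpc = dinucleotide_positions(reference, "GC")

# Cytosines that sit in both contexts (GCG), found by comparing the two sets.
shared = sorted(set(cpg) & set(gpc))

print("CpG C positions:", cpg)
print("GpC C positions:", gpc)
print("In both sets:", shared)
```

To apply this per read, pass the coordinate of the first CpG/GpC reported for the read as `start`, then walk the resulting list in step with the methylation string for that read.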