nloyfer / wgbs_tools

Tools for working with bisulfite sequencing data while preserving reads' intrinsic dependencies

genome "liftover" for pat format #46

Open NikolausMandlburgerCCRI opened 11 months ago

NikolausMandlburgerCCRI commented 11 months ago

Hi! Thank you for developing these great tools! What is the recommended way to convert pat files from one genome assembly to another? As far as I am aware, wgbstools has no dedicated function for this. I am interested in this functionality because I would like to convert the cfDNA pat files from your 2023 Nature paper from hg19 to hg38. Thank you in advance, all the best, Nikolaus

AndriesDeKoker commented 10 months ago

I joined the hg19 pat file with the hg19 CpG.bed.gz from the wgbs_tools references folder, matching on clusterID, to attach genomic coordinates to each pat record. I then ran LiftOver on those coordinates to convert them from hg19 to hg38, joined the lifted coordinates with the hg38 CpG.bed.gz to obtain the hg38 clusterIDs, selected the columns needed for a pat file, and bgzipped the result to get the hg38 pat.gz.
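The join steps described above could be sketched roughly as follows. This is a minimal illustration, not the commenter's actual script. It assumes CpG.bed.gz rows hold (chromosome, 1-based position, CpG index) and pat rows hold (chromosome, CpG index, pattern, count). The names `remap_pat` and `lifted` are hypothetical, and `lifted` stands in for the output of a separate LiftOver run. Note that only the first CpG of each pattern is remapped, which assumes the following CpGs stay consecutive in the target assembly.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical row layouts, assumed for this sketch only.
CpgRow = Tuple[str, int, int]       # (chrom, 1-based position, CpG index)
PatRow = Tuple[str, int, str, int]  # (chrom, CpG index, pattern, count)
Coord = Tuple[str, int]             # (chrom, 1-based position)


def remap_pat(
    pat_rows: Iterable[PatRow],
    src_cpgs: Iterable[CpgRow],
    lifted: Dict[Coord, Coord],
    dst_cpgs: Iterable[CpgRow],
) -> List[PatRow]:
    """Translate pat rows from source-genome CpG indices to target-genome ones.

    `lifted` maps source coordinates to target coordinates and is expected
    to come from an external liftover step. Rows whose start CpG cannot be
    mapped at any stage are dropped.
    """
    index_to_coord: Dict[int, Coord] = {i: (c, p) for c, p, i in src_cpgs}
    coord_to_index: Dict[Coord, int] = {(c, p): i for c, p, i in dst_cpgs}

    out: List[PatRow] = []
    for _chrom, idx, pattern, count in pat_rows:
        src_coord: Optional[Coord] = index_to_coord.get(idx)
        dst_coord: Optional[Coord] = lifted.get(src_coord) if src_coord else None
        new_idx = coord_to_index.get(dst_coord) if dst_coord else None
        if dst_coord is None or new_idx is None:
            continue  # CpG lost in liftover or absent from the target reference
        out.append((dst_coord[0], new_idx, pattern, count))

    # Keep the output ordered by CpG index, then pattern.
    out.sort(key=lambda row: (row[1], row[2]))
    return out
```

In practice the three inputs would be read from the gzipped reference and pat files, and the result would be written tab-separated and compressed with bgzip.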

NikolausMandlburgerCCRI commented 10 months ago

Thank you very much! Am I right that I have to subtract 1 from the start column of the CpG.bed.gz file (1-based) for it to work correctly with liftOver, which expects 0-based BED files?

AndriesDeKoker commented 10 months ago

Yes, indeed, I also did that!
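The off-by-one adjustment confirmed above might look like the sketch below. It assumes each CpG.bed.gz row is (chromosome, 1-based position, CpG index) and writes a 4-column BED line with the CpG index in the name field so it survives the liftover. The helper names `to_bed_line` and `from_bed_line` are hypothetical.

```python
from typing import Tuple


def to_bed_line(chrom: str, pos: int, cpg_index: int) -> str:
    """Turn a 1-based CpG position into a 0-based, half-open BED interval."""
    return f"{chrom}\t{pos - 1}\t{pos}\t{cpg_index}"


def from_bed_line(line: str) -> Tuple[str, int, int]:
    """Read a lifted BED line back into (chrom, 1-based position, CpG index)."""
    chrom, start, _end, name = line.rstrip("\n").split("\t")[:4]
    return chrom, int(start) + 1, int(name)
```

Adding 1 back to the lifted start restores the 1-based convention before matching against the target-genome CpG.bed.gz.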

NikolausMandlburgerCCRI commented 10 months ago

great, thanks a lot!!