TransBioInfoLab / coMethDMR

Detect Regions of Concurrent Differential Methylation
https://transbioinfolab.github.io/coMethDMR/

Switch from sesameData:: to the Hubs #14

Closed gabrielodom closed 2 years ago

gabrielodom commented 2 years ago

We need to update the package from using IlluminaHumanMethylationEPICanno.ilm10b2.hg19:: to using IlluminaHumanMethylationEPICanno.ilm10b4.hg19::. Tiago found some issues with the data in coMethDMR_data:: that are most likely fixed by updating the manifest package. See https://github.com/TransBioInfoLab/coMethDMR_data/issues.

gabrielodom commented 2 years ago

sesameData no longer supports hg19, so I'm having to rewrite half of the package. See https://github.com/zwdzwd/sesame/issues/66#issuecomment-1078416677

gabrielodom commented 2 years ago

Basically, I didn't realise that OrderCpGsByLocation() called sesameDataGet() EVERY TIME it ran, so I'm going through all the functions and adding an argument for a pre-set manifest object (manifest_gr). The affected functions are: OrderCpGsByLocation(), CloseBySingleRegion(), CoMethSingleRegion(), GetCpGsInRegion(), lmmTest(), SplitCpGDFbyRegion(), WriteCloseByAllRegions(); lmmTestAllRegions() and CpGsInfoOneRegion() may have some spillover effects.
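For anyone following along, here is a rough sketch of the pattern, not the actual package code. The names `load_manifest()` and `order_cpgs_by_location()` are hypothetical stand-ins, and a plain data frame is assumed in place of the real manifest object (`manifest_gr`). The idea is only that the function accepts an already-loaded manifest and falls back to loading it when none is supplied.

```r
# Hypothetical stand-in for the expensive manifest lookup; the toy data
# frame below is an assumption, not the real manifest structure.
load_manifest <- function() {
  data.frame(
    cpg = c("cg03", "cg01", "cg02"),
    chr = c("chr1", "chr1", "chr1"),
    pos = c(300L, 100L, 200L),
    stringsAsFactors = FALSE
  )
}

# Sketch of an ordering function with an optional pre-loaded manifest.
# If `manifest` is NULL, the manifest is loaded on this call (the slow path).
order_cpgs_by_location <- function(cpgs, manifest = NULL) {
  if (is.null(manifest)) {
    manifest <- load_manifest()
  }
  hits <- manifest[manifest$cpg %in% cpgs, ]
  # Sort by chromosome, then by genomic position
  hits$cpg[order(hits$chr, hits$pos)]
}

# Load once, then reuse the same object across calls
manifest_df <- load_manifest()
ordered <- order_cpgs_by_location(c("cg03", "cg01", "cg02"), manifest = manifest_df)
print(ordered)
```

Defaulting the argument to `NULL` keeps existing calls working, while callers that loop over many regions can pass the manifest in and skip the repeated load.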

I think I need a branch for this.

gabrielodom commented 2 years ago

Started work: https://github.com/TransBioInfoLab/coMethDMR/commit/3ad47e95747819331894a9de08ddc6b12b5176f9

gabrielodom commented 2 years ago

Similar story for the GetCpGsInRegion() function. Man, no wonder this code was running so slowly. Thankfully, the only affected function is CpGsInfoOneRegion(), but we're theoretically calling that function 40k+ times.

gabrielodom commented 2 years ago

CpGsInfoAllRegions() now calls ImportSesameData() ONCE at the top, and not every time CpGsInfoOneRegion() is called. This is embarrassing.
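A toy illustration of the change, under the assumption that the per-region function only needs to read the manifest. The functions `import_manifest()`, `info_one_region()`, and `info_all_regions()` are hypothetical stand-ins, not the package's real implementations, and the call counter exists only to show how often the loader runs.

```r
n_loads <- 0L

# Hypothetical loader that counts how many times it is invoked
import_manifest <- function() {
  n_loads <<- n_loads + 1L
  data.frame(
    cpg = c("cg01", "cg02", "cg03"),
    pos = c(100L, 200L, 300L),
    stringsAsFactors = FALSE
  )
}

# Per-region worker: takes the manifest as an argument, never loads it
info_one_region <- function(region_cpgs, manifest) {
  manifest[manifest$cpg %in% region_cpgs, ]
}

# Wrapper: loads the manifest once at the top, then reuses it per region
info_all_regions <- function(regions) {
  manifest <- import_manifest()
  lapply(regions, info_one_region, manifest = manifest)
}

regions <- list(r1 = c("cg01", "cg02"), r2 = c("cg03"))
res <- info_all_regions(regions)
print(n_loads)
```

With the load hoisted into the wrapper, the loader runs once per call to the wrapper, regardless of how many regions are processed.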

gabrielodom commented 2 years ago

Fixed in #20