marbl / CHM13

The complete sequence of a human genome

CpG methylation annotation for CHM13 #90

Closed guilhermesena1 closed 9 months ago

guilhermesena1 commented 9 months ago

Hello,

Thanks for all the work on this super important assembly and all the effort to keep annotation organized and easily accessible!

I work on methylation calling algorithms and was very interested in using the CpG methylation value estimates from this paper as a ground truth. However, I cannot find the CpG methylation values estimated in this study, either in the supplementary material for the paper or in this repository. The closest I found was HG002 data mapped to CHM13. While this is useful, I am particularly interested in distinguishing sequencing error from methylation in my applications. For that purpose we are using CHM13 cell lines, which we expect to match the reference exactly, so we need the CHM13-specific methylation values.

Is this annotation available somewhere public? (i.e. CpG methylation values across CHM13). Thank you very much in advance!

arangrhie commented 9 months ago

Hello @guilhermesena1,

There is one track from CHM13 under the folder you were looking for, chm13v2.0_CHM13_CpG_ont_guppy3.6.0_nanopolish0.13.2.bw.

Cheers, Arang
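For readers who want to pull values out of that track, here is a minimal sketch using the third-party `pyBigWig` package. It is an illustration, not something posted by the maintainers. The chromosome name and coordinates in the usage comment are assumptions, and the scale of the stored values (for example a fraction or a percentage) is not stated in this thread, so check it before using the track as ground truth. `read_cpg_values` and `mean_covered` are hypothetical helper names.

```python
import math


def read_cpg_values(bw_path, chrom, start, end):
    """Return per-base values for chrom:start-end (0-based, half-open) from a bigWig."""
    # Imported lazily so mean_covered() stays usable without pyBigWig installed.
    import pyBigWig

    bw = pyBigWig.open(bw_path)
    try:
        if chrom not in bw.chroms():
            raise KeyError(f"{chrom} not found in {bw_path}")
        # Positions with no data in the track are returned as NaN.
        return bw.values(chrom, start, end)
    finally:
        bw.close()


def mean_covered(values):
    """Mean over positions that carry a value, or None if no position does."""
    covered = [v for v in values if not math.isnan(v)]
    return sum(covered) / len(covered) if covered else None


# Example usage (assumes the file was downloaded locally and that "chr1" exists):
# vals = read_cpg_values(
#     "chm13v2.0_CHM13_CpG_ont_guppy3.6.0_nanopolish0.13.2.bw",
#     "chr1", 1_000_000, 1_001_000,
# )
# print(mean_covered(vals))
```

Skipping NaN positions matters here because only CpG sites are expected to carry a value, so most bases in any window will be empty.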

guilhermesena1 commented 9 months ago

Thank you so much for pointing to the appropriate file. This is extremely useful!

ebioman commented 6 months ago

Hi, sorry to chime in here and hijack this already closed issue. I am a bit confused because, e.g. for version 1.1, the section with the link says:

Epigenetic profile

So when the link says HG002 5mC ...from ONT and HIFI, is it:

  1. HG002 raw data on T2T V1.1 with the HG002 Y chromosome + methylation prediction
  2. CHM13 raw data on T2T V1.1 with the HG002 Y chromosome + methylation prediction

Sorry if this is not 100% clear to me :)

arangrhie commented 6 months ago

Good catch, the directory contains both. I've added "CHM13" in the description. 5e6d035

ebioman commented 6 months ago

Hello, thanks, that makes it clearer. So we agree that, e.g.:

  1. chm13v1.1_hg002XYv2.7_hg002_CpG_ont_guppy6.1.2.bw = hg002 sequencing data mapped on the assembly V1.1
  2. chm13v2.0_CHM13_CpG_ont_guppy3.6.0_nanopolish0.13.2.bw = CHM13 sequencing data mapped on the assembly V2.0

I am just double-checking because I am looking for tracks that correspond to CHM13 raw data mapped onto the V1.1 assembly, and it seems the second item above is the only file that corresponds to that?

arangrhie commented 6 months ago

Yes, that is correct.
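The reading of the two file names confirmed above can be sketched as a small parser. It assumes a naming pattern inferred only from those two examples: the tokens before the sample describe the reference assembly, the token immediately before `CpG` is the read source, and the tokens after it are the platform followed by tool versions. Other files in the directory may not follow this pattern, and `parse_track_name` and `TrackName` are hypothetical names.

```python
from typing import NamedTuple, Tuple


class TrackName(NamedTuple):
    reference: Tuple[str, ...]  # tokens describing the assembly mapped to
    sample: str                 # source of the sequencing reads
    platform: str               # e.g. "ont"
    tools: Tuple[str, ...]      # basecaller / caller tokens with versions


def parse_track_name(filename: str, mark: str = "CpG") -> TrackName:
    """Split a track file name into reference, sample, platform and tools."""
    stem = filename[:-3] if filename.endswith(".bw") else filename
    tokens = stem.split("_")
    if mark not in tokens:
        raise ValueError(f"no '{mark}' token in {filename!r}")
    i = tokens.index(mark)
    # Need at least one reference token, one sample token, and a platform token.
    if i < 2 or i + 1 >= len(tokens):
        raise ValueError(f"unexpected layout in {filename!r}")
    return TrackName(
        reference=tuple(tokens[: i - 1]),
        sample=tokens[i - 1],
        platform=tokens[i + 1],
        tools=tuple(tokens[i + 2 :]),
    )


for name in (
    "chm13v1.1_hg002XYv2.7_hg002_CpG_ont_guppy6.1.2.bw",
    "chm13v2.0_CHM13_CpG_ont_guppy3.6.0_nanopolish0.13.2.bw",
):
    print(parse_track_name(name))
```

Under this assumed pattern, the first file resolves to sample `hg002` on reference tokens `chm13v1.1` and `hg002XYv2.7`, and the second to sample `CHM13` on `chm13v2.0`, which matches the interpretation agreed on in the thread.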