cgplab / Rockermeth

A tool for discovering Differentially Methylated Regions
MIT License

methylation beta values #8

Closed BioDataMiner closed 4 months ago

BioDataMiner commented 4 months ago

Why are the methylation beta values obtained from the link provided in the article (https://doi.org/10.5281/zenodo.2586588) different from those downloaded directly from TCGA? Has any processing been applied to them? For the BLCA sample TCGA-UY-A8OC-01A, the β-values in the dataset available on Zenodo (DOI:10.5281/zenodo.2586588) are as follows:

TCGA-UY-A8OC-01A
cg00000029 11
cg00000108 NA
cg00000109 NA
cg00000165 50
cg00000236 94
cg00000289 75
cg00000292 64
cg00000321 51
cg00000363 41

The beta values downloaded directly from The Cancer Genome Atlas (TCGA) for the same sample appear as follows:

TCGA-UY-A8OC-01A
cg00000029 0.127419219152331
cg00000108 0.933892570073041
cg00000109 0.874188644547318
cg00000165 0.491178278833678
cg00000236 0.942600538181307
cg00000289 0.775748142674935
cg00000292 0.650173816075245
cg00000321 0.510477809984418
cg00000363 0.40095181201422

romagnolid commented 4 months ago

No, we didn't process the data; we just rounded to the second decimal place and then multiplied by 100. I checked the raw data that we downloaded in 2018 using the TCGAbiolinks package, and they match the beta values in our Zenodo repository. I guess the TCGA database has been updated since then.

EDIT: Maybe this is related https://docs.gdc.cancer.gov/Data/Release_Notes/Data_Release_Notes/#data-release-380 "All methylation files that were produced with the SeSAMe pipeline was replaced with a new version."

EDIT2: They definitely changed the data: https://docs.gdc.cancer.gov/Data/Release_Notes/Data_Release_Notes/#data-release-320 "Methylation data produced from the SeSAMe pipeline is now available for all TCGA projects." "Files that originated from the methylation liftover pipeline are no longer supported and will no longer appear in the portal."
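
For anyone who wants to check this on their own samples, the transformation described above (round to two decimals, then multiply by 100) can be sketched in Python. This is only an illustrative sketch, not the code used to build the Zenodo dataset: the names `to_zenodo_scale` and `compare` are hypothetical, the values are the ones quoted in this thread, and Python's `round` may treat exact ties differently from whatever was used originally (none of the values below are ties).

```python
import math
from typing import Dict, Optional, Tuple


def to_zenodo_scale(beta: Optional[float]) -> Optional[int]:
    """Round a beta value to two decimals and rescale it to 0-100.

    Missing values (None or NaN) are passed through as None.
    """
    if beta is None or math.isnan(beta):
        return None
    # The outer round() guards against floating-point noise such as 12.999999.
    return int(round(round(beta, 2) * 100))


# Values quoted in this thread for sample TCGA-UY-A8OC-01A (NA -> None).
zenodo: Dict[str, Optional[int]] = {
    "cg00000029": 11, "cg00000108": None, "cg00000109": None,
    "cg00000165": 50, "cg00000236": 94, "cg00000289": 75,
    "cg00000292": 64, "cg00000321": 51, "cg00000363": 41,
}
tcga_current: Dict[str, float] = {
    "cg00000029": 0.127419219152331, "cg00000108": 0.933892570073041,
    "cg00000109": 0.874188644547318, "cg00000165": 0.491178278833678,
    "cg00000236": 0.942600538181307, "cg00000289": 0.775748142674935,
    "cg00000292": 0.650173816075245, "cg00000321": 0.510477809984418,
    "cg00000363": 0.40095181201422,
}


def compare(
    old: Dict[str, Optional[int]], current: Dict[str, float]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return probes whose rescaled current value differs from the old value."""
    diffs = {}
    for probe, beta in current.items():
        rescaled = to_zenodo_scale(beta)
        if old.get(probe) != rescaled:
            diffs[probe] = (old.get(probe), rescaled)
    return diffs


if __name__ == "__main__":
    for probe, (old_value, new_value) in compare(zenodo, tcga_current).items():
        print(f"{probe}: zenodo={old_value} current={new_value}")
```

Of the nine probes quoted here, only cg00000236 and cg00000321 agree after rescaling. The rest differ by a few units or go from NA to a value, which is consistent with the underlying files having been regenerated rather than with a rounding difference.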

BioDataMiner commented 4 months ago

Thank you for your prompt and informative response; it clarifies my query perfectly.