molecularsets / moses

Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models
https://arxiv.org/abs/1811.12823
MIT License

Aging, Epigenetic Drift and Gene Expression. #90

Open AlexanderMath opened 3 years ago

AlexanderMath commented 3 years ago

Sinclair suggests that aging is caused by a loss of epigenetic information [1]. The MOSES dataset contains chemical information for small molecules, but no information on how those molecules alter the epigenetics of human cells. This information is available in the CMap L1000 dataset [2]. In particular, the CMap dataset describes how gene expression (and thus, indirectly, epigenetics) changes when human cells are treated with molecules.

Question. If you want to develop drugs that intervene in the aging process, and believe epigenetics and aging are related, why not include epigenetic information through gene expression? Are you already doing this internally?

I'd be happy to write a script that attempts to combine the two datasets and retrains some of the baselines you have. But I first want to be sure there's no obvious reason for not using the gene expression data.
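Roughly, I imagine the merge step looking something like the sketch below. It assumes both datasets can be reduced to tables keyed by a structure string. The column names (`smiles`, `canonical_smiles`, `pert_id`), the toy rows, and the `normalize` helper are placeholders of mine, not taken from either dataset's documentation. A real join would probably need proper canonicalization (for example with RDKit) or matching on InChIKey, since the same molecule can be written as several different SMILES strings.

```python
from collections import defaultdict


def normalize(smiles):
    # Placeholder: only strips whitespace. A real pipeline would
    # canonicalize the structure so equivalent SMILES compare equal.
    return smiles.strip()


def join_on_structure(moses_rows, l1000_rows):
    """Inner-join MOSES molecules with L1000 perturbagens on SMILES."""
    # Index perturbagens by structure; one molecule may map to several IDs.
    by_smiles = defaultdict(list)
    for row in l1000_rows:
        by_smiles[normalize(row["canonical_smiles"])].append(row["pert_id"])

    joined = []
    for row in moses_rows:
        key = normalize(row["smiles"])
        for pert_id in by_smiles.get(key, []):
            joined.append({"smiles": key, "pert_id": pert_id})
    return joined


# Toy rows, made up for illustration only.
moses_rows = [{"smiles": "CCO"}, {"smiles": "c1ccccc1"}]
l1000_rows = [
    {"pert_id": "PERT-A", "canonical_smiles": "CCO "},
    {"pert_id": "PERT-B", "canonical_smiles": "CC(=O)O"},
]

matched = join_on_structure(moses_rows, l1000_rows)

# Fraction of MOSES molecules that have at least one expression profile.
coverage = len({m["smiles"] for m in matched}) / len(moses_rows)
```

Checking `coverage` first would show whether enough molecules overlap between the two datasets to make retraining the baselines worthwhile.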

[1] https://genetics.med.harvard.edu/sinclair/research.php
[2] https://clue.io/GEO-guide