wvictor14 / team_Methylation-Badassays

STAT 540 Spring 2017 team repository

Feedback on initial group proposal #2

Closed farnushfarhadi closed 7 years ago

farnushfarhadi commented 7 years ago

Here is the group information you provided:

| Name | Department/Program | Expertise/Interests | GitHub ID |
| --- | --- | --- | --- |
| Victor Yuan | Genome Science and Technology | Placental Epigenetics | @wvictor14 |
| Michael Yuen | Medical Genetics | Cancer Genomics | @myuen89 |
| Nivretta Thatra | Bioinformatics | Neuroscience | @nivretta |
| Ming Wan | Statistics | Statistical Genetics and Machine Learning | @MingWan10 |
| Anni Zhang | Genome Science and Technology | Pancreatic Cancer/Cancer metabolism | @annizubc |

Team name: Methylation Badassays

DNA methylation (DNAm) - the covalent modification of DNA at CpG sites resulting in attached methyl groups - regulates gene expression. A large portion of DNAm variability is associated with genetic ancestry** and is heritable [1]. These marks vary across cell types and developmental stages, and can change in response to environmental stimuli. These factors are all taken into account when researchers use gene expression analysis to study diseases. Differentially methylated CpG sites associated with pathology can be confounded by CpGs associated with genetic ancestry, causing spurious results. Therefore, genetic ancestry, as a covariate, needs to be accounted for in any epigenome-wide association study (EWAS). An unpublished set of placental samples (33 Caucasians; 12 Asians) will be used to identify a set of top-ranked CpG sites that associate with self-reported ethnicity. By identifying the CpG sites that are specifically associated with Asian and Caucasian ethnicities in placental tissue, we propose to reduce false-positive CpG site discoveries when looking for associations with placental pathologies [2] by accounting for genetic-ancestry-associated CpGs in a published dataset lacking reliable ethnicity information.

**Note: It is important to distinguish between genetic ancestry, race and ethnicity. The latter two are social constructs and have no genetic definition. In contrast, genetic ancestry is a continuum which describes the architecture of genome variation between populations [3].

  1. Fraser HB, Lam LL, Neumann SM, Kobor MS. Population-specificity of human DNA methylation. Genome Biol. 2012;13(2):R8.
  2. Sparks PJ. Do biological, sociodemographic, and behavioral characteristics explain racial/ethnic disparities in preterm births? Soc Sci Med. 2009;68(9):1667-1675.
  3. Yudell M, Roberts D, DeSalle R, Tishkoff S. Science and society. Taking race out of human genetics. Science. 2016;351(6273):564-565.
farnushfarhadi commented 7 years ago

Hi team @STAT540-UBC/team-badassays

@rbalshaw and I are assigned to your project to help with it. If you have any questions or you want to talk to us, open an issue and tag us to communicate. I would like to meet some of your team members and talk about your project at seminar time and then write the feedback to you. See you tomorrow!

farnushfarhadi commented 7 years ago

Hi team @STAT540-UBC/team-badassays

Nice group composition! Thank you for writing up the initial proposal. Here are my thoughts on your initial proposal, along with some points you could consider:

DNA methylation (DNAm) - the covalent modification of DNA at CpG sites resulting in attached methyl groups - regulates gene expression. A large portion of DNAm variability is associated with genetic ancestry** and is heritable [1]. These marks vary across cell types and developmental stages, and can change in response to environmental stimuli. These factors are all taken into account when researchers use gene expression analysis to study diseases. Differentially methylated CpG sites associated with pathology can be confounded by CpGs associated with genetic ancestry, causing spurious results. Therefore, genetic ancestry, as a covariate, needs to be accounted for in any epigenome-wide association study (EWAS).

An unpublished set of placental samples (33 Caucasians; 12 Asians) will be used to identify a set of top-ranked CpG sites that associate with self-reported ethnicity.

By identifying the CpG sites that are specifically associated with Asian and Caucasian ethnicities in placental tissue, we propose to reduce false-positive CpG site discoveries when looking for associations with placental pathologies [2] by accounting for genetic-ancestry-associated CpGs in a published dataset lacking reliable ethnicity information.
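
To make the confounding concern concrete, here is a minimal simulated sketch of a single CpG whose methylation depends only on ancestry group, tested for association with a pathology that happens to be more common in one group. It is not the team's pipeline: the data are simulated, the group sizes simply mirror the 33/12 split mentioned above, and the helper `cpg_effect`, the case counts, and the effect sizes are all assumptions chosen for illustration.

```python
import numpy as np

def cpg_effect(meth, pathology, ancestry=None):
    """Least-squares coefficient of `pathology` on one CpG's methylation,
    optionally adjusting for an ancestry covariate (hypothetical helper)."""
    columns = [np.ones(len(pathology)), pathology.astype(float)]
    if ancestry is not None:
        columns.append(ancestry.astype(float))
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, meth, rcond=None)
    return coef[1]  # coefficient on the pathology indicator

# Simulated samples (assumption): 33 in ancestry group 0, 12 in group 1.
ancestry = np.array([0] * 33 + [1] * 12)

# Pathology is made more frequent in group 1 (8/33 vs 9/12 cases),
# so it is correlated with ancestry by construction.
pathology = np.array([1] * 8 + [0] * 25 + [1] * 9 + [0] * 3)

# Methylation depends on ancestry only; pathology has no true effect.
rng = np.random.default_rng(0)
meth = 0.4 + 0.2 * ancestry + rng.normal(0.0, 0.02, size=ancestry.size)

unadjusted = cpg_effect(meth, pathology)
adjusted = cpg_effect(meth, pathology, ancestry)

print(f"pathology effect, unadjusted:        {unadjusted:.3f}")
print(f"pathology effect, ancestry-adjusted: {adjusted:.3f}")
```

In this toy setup the unadjusted model attributes an ancestry-driven methylation difference to pathology, while including the ancestry covariate shrinks that coefficient toward zero. A real EWAS would fit a model like this across many CpGs, typically with a dedicated package and multiple-testing correction.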


Please remember to provide a table for the division of labour (e.g. literature research, data cleaning, QC analysis, statistical analysis, validation of results, interpretation of results, and anything else relevant to your project).

Looking forward to seeing your results! Good luck! :)

@rbalshaw

rbalshaw commented 7 years ago

I don't have much feedback to offer over what @farnushfarhadi has provided.

While developing and validating a methylation-based assessment of genetic ancestry in some way would be interesting on its own merit, you might find it even more fun to demonstrate that some other analysis where genetic ancestry was poorly measured (perhaps using self-report, if at all) could be improved by incorporation of your methylation-based assessment of ancestry.

It's a challenging area, but one that has important implications. For example, I've been involved with drug-development efforts where a calculated score associated with "continent of origin" was used in order to control for/adjust for 'genetic ancestry'. The FDA required it of that project, and it was fascinating to work with the data -- SNP chip data, in that case.

Let me know if you'd like to chat further. I'll try to monitor this spot, too.

farnushfarhadi commented 7 years ago

@STAT540-UBC/team-badassays Hi team

How is everything going with your final proposal? Please make sure you go to seminar tomorrow. I will not be there, but I have asked Amrit and Santina to take care of your questions. Feel free to discuss your questions, methodology, or any confusion or problems with them.

Good Luck. :)