sr320 / ceabigr

Workshop on genomic data integration with an emphasis on epigenetic data (FHL 2022)

Run mixomics and two simple files #68

Closed sr320 closed 9 months ago

sr320 commented 2 years ago

Using gene expression https://github.com/epigeneticstoocean/2018_L18-adult-methylation/blob/main/analyses/gene_fpkm.csv

gene level methylation https://raw.githubusercontent.com/sr320/ceabigr/main/output/40-gene-methylaiton.csv

kubu4 commented 2 years ago

@sr320 - Do you know why the gene level methylation file has NAs present?

kubu4 commented 2 years ago

To add some background, I was messing around with the mixOmics package, trying to run a PCA with the gene methylation file, and the PCA function would just run indefinitely (whereas the gene expression file would load almost instantly). I was able to figure out that the difference between the two files was the NAs in the gene methylation file. After replacing those NAs with 0, the PCA function worked.
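For anyone comparing options, here is a minimal sketch of two ways to get an NA-free table before a PCA: zero-filling (the workaround described above) and dropping genes that are missing in any sample. It is plain Python rather than the R/mixOmics code actually run here, and the table, gene names, and helper functions are all made up for illustration.

```python
import math

# Hypothetical gene-by-sample methylation table (percent).
# math.nan stands in for an NA in the real file.
table = {
    "gene_a": [80.0, 75.0, 82.0],
    "gene_b": [math.nan, 10.0, 12.0],
    "gene_c": [50.0, math.nan, 55.0],
    "gene_d": [5.0, 6.0, 4.0],
}


def zero_fill(rows):
    """Replace every missing value with 0."""
    return {
        gene: [0.0 if math.isnan(v) else v for v in values]
        for gene, values in rows.items()
    }


def drop_incomplete(rows):
    """Keep only genes that have a value in every sample."""
    return {
        gene: values
        for gene, values in rows.items()
        if not any(math.isnan(v) for v in values)
    }


filled = zero_fill(table)
complete = drop_incomplete(table)

print(filled["gene_b"])   # [0.0, 10.0, 12.0]
print(sorted(complete))   # ['gene_a', 'gene_d']
```

Zero-filling keeps every gene but records a missing value as "unmethylated", while dropping incomplete genes shrinks the table without inventing values. Which one is appropriate depends on what the NAs actually represent.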

sr320 commented 2 years ago

NAs are likely the result of genes whose CpGs have no methylation data. Or, for that matter, they could be the result of taking a mean of all CpGs in a gene where at least one was an NA (not sure how NAs would be handled in that instance).
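The second possibility can be sketched in a few lines. This is an illustration in plain Python, not the code that produced the methylation file; the function names and CpG values are hypothetical. A plain mean propagates a single missing value to the whole gene (R's `mean()` behaves the same way by default), whereas skipping missing values (`na.rm = TRUE` in R) averages only the observed CpGs.

```python
import math


def mean_propagate(values):
    """Plain mean: one nan among the inputs makes the result nan."""
    return sum(values) / len(values)


def mean_skip_missing(values):
    """Mean over observed values only; nan if nothing was observed."""
    observed = [v for v in values if not math.isnan(v)]
    return sum(observed) / len(observed) if observed else math.nan


# Hypothetical percent methylation for three CpGs in one gene,
# with the middle CpG missing.
cpgs = [80.0, math.nan, 60.0]

print(mean_propagate(cpgs))     # nan
print(mean_skip_missing(cpgs))  # 70.0
```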

kubu4 commented 2 years ago

Thanks. Interesting.

So, if we have no methylation data for a gene, does that mean an average methylation of 0, or does that mean we had insufficient sequencing depth across the gene to reach the set threshold (e.g. < 5x coverage across the entire gene)?

sr320 commented 2 years ago

Insufficient sequencing depth
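To illustrate why that is different from 0% methylation, here is a rough sketch in plain Python. It assumes a per-CpG read-depth cutoff of 5 (the thread only mentions 5x as an example, and the actual pipeline may filter differently); the function name and read counts are hypothetical.

```python
MIN_COVERAGE = 5  # assumed cutoff; 5x is only the example mentioned in this thread


def gene_methylation(cpgs, min_coverage=MIN_COVERAGE):
    """Mean percent methylation over CpGs with enough reads.

    cpgs is a list of (methylated_reads, total_reads) pairs.
    Returns None (an NA) when no CpG reaches the coverage cutoff.
    """
    passing = [
        100.0 * methylated / total
        for methylated, total in cpgs
        if total >= min_coverage
    ]
    return sum(passing) / len(passing) if passing else None


# Well-covered gene with no methylated reads: a measured 0%.
unmethylated = [(0, 10), (0, 8)]
# Poorly covered gene: nothing passes the cutoff, so there is no estimate.
low_depth = [(1, 2), (0, 3)]

print(gene_methylation(unmethylated))  # 0.0
print(gene_methylation(low_depth))     # None
```

The first gene is a real measurement of zero; the second has no usable measurement at all, which is what an NA records.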



kubu4 commented 2 years ago

I decided to mess around with this a bit, but not extensively, so I don't have a dedicated project for this. Here's the rendered R Markdown if anyone's interested:

https://rpubs.com/kubu4/cvir-gonad-oa-mixomics_testing

Basically, ran the following:

Average gene methylation

Both PCA and sPCA show clustering by sex, regardless of methylation status and/or treatment.

Average gene methylation and gene expression (FPKM)