ScienceParkStudyGroup / studyGroup

Gather together a group to skill-share, co-work, and create community
https://www.scienceparkstudygroup.info

Gene Regulatory Network Identification from Counts #32

Closed Fred-White94 closed 5 years ago

Fred-White94 commented 6 years ago

Hi all,

My current problem is identifying potential gene regulatory networks involving transposable elements (TEs) from RNA-seq count data on 10 samples. So far I have run a correlation analysis on TEs that lie within 50 kb of a gene. This is fine as a preliminary analysis, but with such a small sample size (n=10) it is not a very robust way of looking at the data. My current dataset contains all loci, the distance between each TE and its gene (if within 50 kb), and the correlation coefficients with their p-values. I also have the raw count data.
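For readers following along, here is a minimal sketch of the kind of per-pair correlation described above. It is not the analysis actually run here: the count vectors are made up, the function names (`ranks`, `spearman`, `permutation_pvalue`) are hypothetical, and the choice of a rank (Spearman) correlation with a permutation p-value is an assumption, picked because it makes few distributional assumptions when n=10.

```python
import math
import random

def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation; NaN if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: shuffle sample labels of y, recompute |rho|."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(spearman(x, y_perm)) >= observed - 1e-12:
            hits += 1
    # +1 in numerator and denominator so the p-value is never exactly 0
    return (hits + 1) / (n_perm + 1)

# Hypothetical counts for one TE and its neighbouring gene across 10 samples
te_counts = [12, 30, 45, 8, 60, 22, 75, 15, 90, 40]
gene_counts = [110, 260, 400, 95, 480, 210, 640, 150, 700, 330]

rho = spearman(te_counts, gene_counts)
p_value = permutation_pvalue(te_counts, gene_counts)
print(f"rho={rho:.3f}, permutation p={p_value:.4f}")
```

Because the correlation is rank-based, it gives the same result on raw counts and on any monotone transform of them (for example a log transform). Counts would normally still need normalising for library size before samples are compared.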

What's an intelligent way to proceed to potentially identify some TEs and their related genes? If clarification is needed then let me know!

Cheers

Fred-White94 commented 6 years ago

Now:

After further filtering, the data is down to ~6000 correlations, of which 2000 have a p-value < 0.05. I have also bootstrapped the correlations, which gives a different correlation coefficient for each comparison along with a standard error. The standard errors are reasonably low, but I'm not convinced this is the best way of tackling the problem or of using bootstrapping.
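As a point of reference, bootstrapping a single TE-gene correlation might look like the sketch below. This is an illustration under stated assumptions, not the procedure used above. Samples are resampled with replacement as pairs, so a TE count always stays with the gene count from the same sample. Counts are transformed with log2(count + 1) before a Pearson correlation. The data and the name `bootstrap_correlation` are hypothetical.

```python
import math
import random
import statistics

def pearson(x, y):
    """Pearson correlation; NaN if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def bootstrap_correlation(x, y, n_boot=2000, seed=0):
    """Resample sample indices with replacement; return (SE, 95% percentile CI)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # paired resampling
        r = pearson([x[i] for i in idx], [y[i] for i in idx])
        if not math.isnan(r):  # skip degenerate resamples (constant vector)
            estimates.append(r)
    estimates.sort()
    se = statistics.stdev(estimates)
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return se, (lower, upper)

# Hypothetical counts for one TE-gene pair across 10 samples
te_counts = [12, 30, 45, 8, 60, 22, 75, 15, 90, 40]
gene_counts = [150, 210, 390, 120, 300, 260, 520, 90, 610, 280]

log_te = [math.log2(c + 1) for c in te_counts]
log_gene = [math.log2(c + 1) for c in gene_counts]

r_obs = pearson(log_te, log_gene)
se, (ci_low, ci_high) = bootstrap_correlation(log_te, log_gene)
print(f"r={r_obs:.3f}, bootstrap SE={se:.3f}, 95% CI=({ci_low:.3f}, {ci_high:.3f})")
```

With only 10 samples, each resample draws from very few distinct values, so a low standard error here should be read with caution. Looking at the width of the interval, and whether it excludes zero, is likely more informative than the standard error alone.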

Would a potential solution be to bootstrap the genes instead, for example by computing some kind of overall profile/score for each sample and assessing how much that score changes under bootstrapping?

Thanks