Closed kdahlquist closed 6 years ago
Data was compiled to perform a multiple regression analysis for db1 in the following spreadsheet: https://github.com/kdahlquist/DahlquistLab/blob/master/data/15-gene_networks_analysis/db1_mse-multiple-regression_BK20170220.xlsx.
A brief, preliminary multiple regression analysis was performed. The results of this analysis have been uploaded so that we may discuss them next Thursday: https://github.com/kdahlquist/DahlquistLab/blob/master/documents/db1-Multiple-Regression_BK20170220.pptx.
Next task for @bklein7 will be to do the multiple regression on other networks. db2 and db3 should be next because they are smaller/larger versions of each other and it will be interesting to see if they end up being similar or different in the analysis.
Actually, db2 and db3 look pretty different in the histograms, so it will be really interesting to see the analysis.
This week, I compiled data in preparation for completing multiple regression analyses for db2-db6. Unfortunately, I could not complete the analyses for db2 and db3, as only Gephi data for the old 15-gene dCIN5 network is available on the Dahlquist Lab repository. When possible, I would like an update from @maggie-oneil or @khorstmann regarding whether Gephi stats for the new dCIN5 networks (db2 & db3) are available and, if so, where I can find them.
I also began a multiple regression analysis for db4. Thus far, the only significant predictor of average MSE in this network appears to be the B&H-corrected p-value (within-strain ANOVA). However, this correlation is negative, which is contrary to our expectation and quite possibly artifactual.
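For reference, the B&H (Benjamini and Hochberg) correction mentioned above can be sketched in a few lines. This is a minimal illustration in Python with NumPy, not the procedure used to produce the lab's p-values; the function name `benjamini_hochberg` and the example p-values are made up for the sketch.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(p_values, dtype=float)
    n = len(p)
    order = np.argsort(p)                      # indices that sort p ascending
    scaled = p[order] * n / np.arange(1, n + 1)  # p_(i) * n / i
    # Enforce monotonicity: each adjusted value is the minimum of the
    # scaled values at its own rank and all higher ranks.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(scaled, 1.0)  # cap at 1 and restore order
    return adjusted

# Hypothetical raw p-values, one per gene (not real data).
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
```

Because the adjustment is monotone in the raw p-value, it preserves the ranking of genes within a network, so using corrected rather than raw p-values as a predictor changes the scale but not the ordering.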
The db2 and db3 raw data can be found in the data repository for the Dahlquist Lab page.
db2: https://github.com/kdahlquist/DahlquistLab/blob/master/data/15-gene_networks_analysis/Gephi_raw_db2_14-gene_25-edges_NW_dCIN5_fam_Sigmoid_estimation_output.csv
db3: https://github.com/kdahlquist/DahlquistLab/blob/master/data/15-gene_networks_analysis/Gephi_raw_db3_17-genes_32-edges_NW-dCIN5-fam_Sigmoid_estimation_output%20.csv
This week, multiple regression analyses were performed for db2-db6. The Excel spreadsheets containing the data used for these analyses can be found in this folder: https://github.com/kdahlquist/DahlquistLab/tree/master/data/15-gene_networks_analysis.
The compiled results of the multiple regression analyses conducted for db1-db6 were uploaded here: https://github.com/kdahlquist/DahlquistLab/blob/master/data/15-gene_networks_analysis/db1-db6_Multiple-Regression_BK20170329.pptx.
To summarize the discussion at the March 30 meeting, @bklein7 has run a multiple regression in SPSS on db1-db6 comparing MSE, P, d, b, ANOVA p value, in-degree, out-degree, betweenness centrality, closeness centrality, eigenvector centrality, and eccentricity.
He needs to double-check whether the Gephi stats were from unweighted or weighted networks.
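As a rough cross-check outside SPSS, the same kind of multiple regression can be sketched with ordinary least squares in Python with NumPy. This is only an illustration under stated assumptions: MSE is treated as the dependent variable, only two of the predictors are included, the helper `fit_ols` is hypothetical, and all values are synthetic rather than taken from the db1-db6 workbooks.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot

# Synthetic, noise-free toy data for 15 hypothetical genes (not lab data).
rng = np.random.default_rng(0)
n_genes = 15
degradation_rate = rng.uniform(0.05, 0.5, n_genes)
in_degree = rng.integers(0, 5, n_genes).astype(float)
mse = 0.2 + 1.5 * degradation_rate - 0.1 * in_degree

coef, r_squared = fit_ols(np.column_stack([degradation_rate, in_degree]), mse)
# coef holds [intercept, degradation_rate slope, in_degree slope]
```

Unlike SPSS, this sketch reports only the coefficients and R^2; significance tests for individual predictors would need standard errors, which are not computed here.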
@kdahlquist noted that the strongest relationship (across 3 networks) was for the degradation rate, something that we actually provided to the model and didn't estimate.
She also noted the following (same list as task issue #343):
After reviewing the Gephi worksheets used in the last round of multiple regression analyses, I noted that weighted statistics were present (e.g. weighted in-degree) but not used. Instead, the statistics that were not labeled as "weighted" were used. Nonetheless, this leads me to believe that the Gephi statistics were derived from the weighted networks. I will confirm this during lab meeting today.
I have begun compiling data for the next round of multiple regression analyses and will perform these tests next week.
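To make the weighted versus unweighted distinction concrete, here is a small Python sketch that computes both versions of in-degree from an edge list. It assumes the weighted statistic is the sum of incoming edge weights and the unweighted one is a simple count of incoming edges; the function `in_degrees`, the node names, and the weights are all hypothetical.

```python
def in_degrees(edges):
    """Return (unweighted, weighted) in-degree dicts keyed by target node.

    edges: iterable of (source, target, weight) tuples.
    """
    unweighted = {}
    weighted = {}
    for _source, target, weight in edges:
        unweighted[target] = unweighted.get(target, 0) + 1        # count edges
        weighted[target] = weighted.get(target, 0.0) + weight     # sum weights
    return unweighted, weighted

# Hypothetical three-edge network with one negative (repressing) weight.
edges = [("A", "C", 0.5), ("B", "C", -1.25), ("A", "B", 2.0)]
unweighted, weighted = in_degrees(edges)
```

In this toy network, node C has two incoming edges but a weighted in-degree of -0.75, since positive and negative weights partly cancel. The two statistics can therefore rank nodes quite differently, which is why it matters which one went into the regression.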
Open tasks:
The protocol for performing multiple regression analyses in SPSS (in the context of GRNmap) can be found here: http://www.openwetware.org/wiki/Analyzing_GRNmap_Output_Workbooks_Using_Multiple_Regression_and_SPSS. I also added this link to the multiple regression analysis folder's ReadMe file within the Dahlquist Lab repository: https://github.com/kdahlquist/DahlquistLab/tree/master/data/Spring2017/15-gene_networks_analysis/multiple_regression_analysis.
During our final lab meeting, we agreed that the new multiple regression analyses for the three best and worst random networks would be performed next semester.
We will eventually come back to this when we re-run models with cleaned-up GRNmap code. However, for now, we will close it. It will be here to refer back to when needed again.
Creating a new issue for this, splitting it off from #315.