hagenaue closed this issue 3 years ago
For background on the statistical output (Regression, Chi-Square): https://onlinestatbook.com/2/index.html
Guidelines for writing up statistics: https://www.statisticshowto.com/probability-and-statistics/reporting-statistics-apa-style/
Regression: B = regression slope, p = p-value.
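As a rough sketch of where B and p come from in R, the block below fits a linear regression on simulated data and pulls the slope and its p-value out of the coefficient table. The data, the variable names (`Age`, `pH`), and the helper `format_regression()` are hypothetical and only for illustration; check the final rounding and formatting against the APA guidelines linked above.

```r
# Simulated example data (hypothetical; not from the Maycox dataset)
set.seed(1)
Age <- runif(30, min = 20, max = 80)
pH  <- 6.8 - 0.005 * Age + rnorm(30, sd = 0.1)

# Fit pH as a function of Age
fit   <- lm(pH ~ Age)
coefs <- summary(fit)$coefficients  # columns: Estimate, Std. Error, t value, Pr(>|t|)

B <- coefs["Age", "Estimate"]   # regression slope
p <- coefs["Age", "Pr(>|t|)"]   # p-value for the slope

# Hypothetical helper: build a short "B = ..., p = ..." string
format_regression <- function(B, p, digits = 3) {
  if (p < 0.001) {
    p_text <- "p < .001"
  } else {
    # Drop the leading zero from the p-value, e.g. 0.040 becomes .040
    p_text <- paste0("p = ", sub("^0", "", formatC(p, format = "f", digits = digits)))
  }
  paste0("B = ", signif(B, digits), ", ", p_text)
}

regression_text <- format_regression(B, p)
print(regression_text)
```

The same coefficient table also holds the standard error and t value if the write-up needs them.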
Making prettier covariate vs. diagnosis plots (for numeric covariates: age, pH, PMI, RNA degradation, RIN):
pdf("PutPlotnamehere.pdf", width=7, height=7) # Adjust the size as needed: smaller is better, as long as the plot stays legible
boxplot(NumericCovariate~GroupingVariable_Factor, ylab="NameOfNumericCovariate", col=c("darkorange2", "burlywood1", "red", "pink")) # Use one color per group, or skip col altogether
stripchart(NumericCovariate~GroupingVariable_Factor, vertical = TRUE, method = "jitter", add = TRUE, pch = c(20, 1, 17, 2), cex=2, cex.axis=10, cex.lab=10, col = 'black') # Overlays the jittered data points on the box plot
dev.off()
For categorical covariates vs. diagnosis, make tables. This can be done by outputting a .csv and working in Excel, or done entirely in R.
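A minimal sketch of the R route, assuming the categorical covariate and the diagnosis are stored as factors of equal length. The example values, the group labels ("Control", "Case"), and the output file name are hypothetical.

```r
# Simulated example data (hypothetical; not from the Maycox dataset)
Diagnosis <- factor(c("Control", "Control", "Control", "Case", "Case", "Case", "Case", "Control"))
Gender    <- factor(c("M", "F", "M", "M", "F", "M", "M", "F"))

# Cross-tabulate the categorical covariate against diagnosis
covariate_table <- table(Gender, Diagnosis)
print(covariate_table)

# Add row and column totals for the write-up
print(addmargins(covariate_table))

# Optionally write the counts to a .csv for formatting in Excel
out_file <- file.path(tempdir(), "Gender_vs_Diagnosis.csv")  # hypothetical file name
write.csv(as.data.frame.matrix(covariate_table), out_file)
```

Writing to `tempdir()` keeps the example from touching a real output directory; swap in the project's output path when using it for real.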
Make a meta-figure with subpanels for each diagnosis*covariate relationship (like Ohayon), and then write a figure legend with the APA-formatted stats.
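One possible way to lay out the subpanels is `par(mfrow = ...)`, reusing the boxplot and stripchart approach from above once per covariate. This is only a sketch on simulated data: the covariate values, the two group labels, the 2 x 2 layout, and the output file name are all assumptions, and the layout used by Ohayon may differ.

```r
# Simulated example data (hypothetical; not from the Maycox dataset)
set.seed(1)
Diagnosis  <- factor(rep(c("Control", "Case"), each = 10))
covariates <- data.frame(Age = rnorm(20, 50, 10), pH  = rnorm(20, 6.5, 0.2),
                         PMI = rnorm(20, 20, 5),  RIN = rnorm(20, 7, 1))

out_file <- file.path(tempdir(), "CovariatesVsDiagnosis.pdf")  # hypothetical file name
pdf(out_file, width = 7, height = 7)
par(mfrow = c(2, 2))  # 2 x 2 grid: one subpanel per covariate

for (i in seq_along(covariates)) {
  name <- names(covariates)[i]
  boxplot(covariates[[name]] ~ Diagnosis, ylab = name, xlab = "Diagnosis",
          col = c("darkorange2", "burlywood1"))
  # Overlay the jittered data points on each box plot
  stripchart(covariates[[name]] ~ Diagnosis, vertical = TRUE, method = "jitter",
             add = TRUE, pch = 20, col = "black")
  # Panel label (A, B, C, D) to reference from the figure legend
  mtext(LETTERS[i], side = 3, adj = 0, line = 1, font = 2)
}
dev.off()
```

The panel letters give the figure legend something to point at when reporting the statistic for each relationship.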
Re-output the statistics for subject variable vs. subject variable for the Maycox dataset after removing the outlier samples: GSM439786.CEL.gz (so extreme that it drives PC3) and GSM439795.CEL.gz (missing pH data and has the most extreme PC1).
From the sample code: lines 455-479 define the "bad samples" and remove them from the gene expression data and metadata (and derived factors, etc.). This code overwrites other objects in your workspace, so make sure to save your workspace under a new name (e.g. wOutliersRemoved) and then change the output directory so you don't overwrite your earlier output.
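The sample code itself is not reproduced here, so the block below is only a sketch of the general idea, not the code from lines 455-479. It assumes samples are the columns of the expression matrix and the rows of the metadata; the object names (`expression_data`, `sample_metadata`), the placeholder sample names, and all values are hypothetical.

```r
# Sample file names flagged as outliers in the notes above
bad_samples <- c("GSM439786.CEL.gz", "GSM439795.CEL.gz")

# Simulated stand-ins for the real objects (hypothetical names and values)
set.seed(1)
sample_ids <- c("GSM439786.CEL.gz", "GSM439795.CEL.gz", "SampleA.CEL.gz", "SampleB.CEL.gz")
expression_data <- matrix(rnorm(12), nrow = 3,
                          dimnames = list(paste0("Gene", 1:3), sample_ids))
sample_metadata <- data.frame(SampleID = sample_ids, Group = c("A", "B", "A", "B"))

# Keep every sample that is not in the bad-sample list
keep <- !(colnames(expression_data) %in% bad_samples)
expression_data_noOutliers <- expression_data[, keep, drop = FALSE]
sample_metadata_noOutliers <- sample_metadata[keep, , drop = FALSE]

# Sanity check: expression columns and metadata rows still line up
stopifnot(identical(colnames(expression_data_noOutliers),
                    sample_metadata_noOutliers$SampleID))

# Save the workspace under a new name so earlier results are not overwritten
# save.image("wOutliersRemoved.RData")
```

Unlike the sample code, this sketch writes the filtered data to new objects rather than overwriting the originals. To re-run the downstream code unchanged, assign the filtered objects back to the original names after saving the earlier workspace.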
Since it reuses the same objects, you should be able to just re-run all of the subject variable vs. subject variable and PCA vs. subject variable code.
Write up a brief description of which important covariates may be confounding variables (pH, RNA degradation, PMI, Age, Gender, RIN, batch/scan date) in the Maycox (and eventually Iwamoto) dataset, with figures/tables to illustrate the most interesting relationships and a statistical summary.