SpaCE-Lab-MSU / warmXtrophic

This repository contains R scripts that organize, analyze, and plot data from the long-term Warming X Trophic Interactions experiment at Kellogg Biological Station (KBS) and University of Michigan Biological Station (UMBS).

warmXtrophic stats: greenup #10

Closed dobsonk2 closed 1 year ago

dobsonk2 commented 3 years ago

Hi Phoebe,

Here's a link to a markdown file for the greenup stats: https://github.com/SpaCE-Lab-MSU/warmXtrophic/blob/master/scripts/plant_comp_scripts/greenup_analyses.pdf

The first few pages go over what we've already discussed. On page 5 I try a Poisson distribution, but it doesn't seem to fit very well. I still include a few glmer models with Poisson, and on page 9 I attempt a Friedman's test. I get an error, which is shown in full on page 9, and I can't figure out how to get past it. I've tried removing NAs from the data, but that doesn't seem to help either.
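For context on the Friedman's test error: the error text is not quoted in this thread, so the following is only one possible explanation. Base R's `friedman.test` expects an unreplicated complete block design, meaning exactly one value in every block-by-group cell, and it stops with an error otherwise. Dropping rows with NAs can also leave cells empty, which breaks the design in the same way. A minimal sketch with made-up data (the column names `plot`, `year`, `species`, and `half_cover_date` are hypothetical, not taken from the repo):

```r
# Hypothetical toy data: 4 plots (blocks) x 3 years (groups), with 2 species
# records per plot-year, so every block-by-group cell is replicated.
set.seed(1)
toy <- expand.grid(plot = paste0("P", 1:4),
                   year = 2016:2018,
                   species = c("sp1", "sp2"))
toy$half_cover_date <- round(rnorm(nrow(toy), mean = 150, sd = 10))

# friedman.test fails on replicated cells; capture the error with try()
# so the script keeps running.
res_raw <- try(friedman.test(half_cover_date ~ year | plot, data = toy),
               silent = TRUE)
inherits(res_raw, "try-error")

# Collapse to one value per plot-year (here the median), then retest.
agg <- aggregate(half_cover_date ~ plot + year, data = toy, FUN = median)
res_agg <- friedman.test(half_cover_date ~ year | plot, data = agg)
res_agg
```

Running `table(toy$year, toy$plot)` on the real data is a quick way to see whether any cell has a count other than 1. Whether the median is the right way to collapse replicates is a separate analysis decision.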

After that, I include data for UMBS, which we haven't looked at yet. When I look at separate histograms for ambient and warmed, on page 12, they look pretty good! However, the Shapiro-Wilk p-value is still <0.05. I also do a log transformation that looks good, but the Shapiro-Wilk p-value is <0.05 overall and for warmed and ambient separately. I have yet to try other distributions on these data.

I'm currently at a standstill with the Friedman's test for KBS due to that error. I did some troubleshooting to figure out what was wrong, but I haven't found anything yet. Also, just a general question: how important is it for the Shapiro-Wilk p-value to be >0.05?

Thanks!

plzmsu commented 3 years ago

Hi Kara,

How important is it for Shapiro-Wilk to be >0.05? I think the answer is that a slightly non-normal dataset is not going to be too problematic, but if it's really not normal then it's an issue. If no transformation or alternative statistical distribution seems to work, then you should look at outliers, skewness, and kurtosis. So, are there any outliers that would be justified to remove? My guess is that the peak around 225 is what's causing the issue.

Also, it's not clear to me whether each point in the first histograms is a plot-year data point. Remember that residuals only make sense once they are interpreted in the context of the explanatory variable(s). So the histograms of raw data at the beginning are useful for getting an initial sense, but maybe it's just year-to-year variation that explains that peak. What are the residual plots from? What equation? In this exploration, it's going to be helpful to print out the R code itself, not just the plots.

I haven't had time to run this code, so maybe we can go over it in our meeting today with screen share.
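The point about residuals can be sketched in base R. Everything here is illustrative: the data are simulated, the column names (`state`, `year`, `half_cover_date`) are placeholders, not the real greenup columns, and the model is a plain `lm`, not whichever model the analysis ends up using.

```r
# Simulated stand-in data: year-to-year shifts produce a multi-peaked raw
# histogram even though the noise within each year is normal.
set.seed(42)
sim <- expand.grid(rep = 1:30,
                   state = c("ambient", "warmed"),
                   year = factor(2016:2019))
year_shift <- c("2016" = 0, "2017" = 40, "2018" = -20, "2019" = 60)
sim$half_cover_date <- 150 +
  unname(year_shift[as.character(sim$year)]) +
  ifelse(sim$state == "warmed", -5, 0) +
  rnorm(nrow(sim), sd = 6)

# Shapiro-Wilk on the raw response, ignoring the explanatory variables.
raw_test <- shapiro.test(sim$half_cover_date)

# Shapiro-Wilk on residuals after accounting for treatment and year.
fit <- lm(half_cover_date ~ state + year, data = sim)
resid_test <- shapiro.test(residuals(fit))

raw_test$p.value
resid_test$p.value
```

In this toy case the raw response is far from normal only because of the year effect. A histogram or `qqnorm()`/`qqline()` plot of `residuals(fit)` is usually more informative than the p-value alone.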

plzmsu commented 3 years ago

Notes on format in the .Rmd and PDF: "authors" is not showing up at the top of the PDF. Eventually, figure out a way to tidy up the all-caps headers so that they are in 2 columns and more legible.

plzmsu commented 3 years ago

@dobsonk2 update "REQUIRES" at the top of this script to specify what needs to be run before it. It relies on "plant_comp_clean_L0.R". In "plant_comp_clean_L0.R", remove the blank first column in "greenup" before this line: write.csv(final, file="L1/greenup/final_greenup_L1.csv"). Then "greenup" will be read in without the extra column. After that, drop the edit I made in https://github.com/SpaCE-Lab-MSU/warmXtrophic/blob/master/scripts/plant_comp_scripts/greenup_analyses.Rmd to handle this after the fact.
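One possible way to do this, assuming the blank column is the row-name column that `write.csv` writes by default (the thread does not confirm that), is to pass `row.names = FALSE` when writing. The sketch below uses a throwaway data frame with hypothetical columns and temp files, not the real `final` object or the L1 path:

```r
# Stand-in for the cleaned greenup data frame (hypothetical columns).
final <- data.frame(site = c("kbs", "umbs"),
                    plot = c("A1", "B2"),
                    half_cover_date = c(140, 155))

# By default write.csv writes row names as an unnamed first column,
# which read.csv then imports as a column called "X".
f_default <- tempfile(fileext = ".csv")
write.csv(final, file = f_default)
names(read.csv(f_default))

# With row.names = FALSE the extra column is never written.
f_clean <- tempfile(fileext = ".csv")
write.csv(final, file = f_clean, row.names = FALSE)
names(read.csv(f_clean))
```

If the blank column comes from somewhere else (for example, an empty column already present in the raw file), it would need to be dropped explicitly before writing.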

Related to this, double check that the workflow diagram in the repo README reflects these steps. One confusing aspect is that "greenup" is really phenology but is included in plant comp; making that clear in the workflow would be helpful.