nargolg1 / PhD-Shelter-Shrub-Climate


Shelter effects on desert animals: stats feedback #4

Open zenrabbit opened 2 months ago

zenrabbit commented 2 months ago

global models = really big picture

  1. yes, reuse the opens for rii.

  2. in boxplots, show the means as triangles and drop geom_point

  3. in models, I thought we discussed abundance ~ mean_temp + microsite %in% site_code + (1 | year). Did you try the nested microsite model with year as a random effect? How are you going to handle season in the models?

  4. also, that is a lot of models. What about a PCA on the three microclimate measures: check the percent variation explained, then use PC1, for instance?

  5. In your radiation stats, you tested humidity instead...

  6. Start with 'did animals associate with shelters more/same as shrubs and different from open'?

  7. The 'weeds' of this is 'why did animals co-occur in some places more?' but did they - show that first. then worry about why.

  8. Can you pls add hypotheses, predictions, and assumptions first in the file, then test each one.

  9. for radiation stats only, filter out 0 = night and use only daytime
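
For point 2, a minimal sketch of a boxplot with means drawn as triangles and no raw-point layer. It assumes ggplot2 is installed; the data frame `d`, its values, and the choice of `mean_temp` as the plotted variable are made up for illustration and are not from the project files.

```r
library(ggplot2)

# Simulated stand-in data (hypothetical values, not the project's data)
set.seed(1)
d <- data.frame(
  microsite = rep(c("open", "shrub", "shelter"), each = 30),
  mean_temp = c(rnorm(30, 35, 3), rnorm(30, 31, 3), rnorm(30, 30, 3))
)

# Boxplot per microsite; stat_summary() adds one triangle (shape 17) at each
# group mean. No geom_point() layer, so raw observations are not drawn.
p <- ggplot(d, aes(x = microsite, y = mean_temp)) +
  geom_boxplot() +
  stat_summary(fun = mean, geom = "point", shape = 17, size = 3)

p
```

The triangle sits at the mean while the box line stays at the median, so skew in a microsite shows up as a gap between the two.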

zenrabbit commented 2 months ago

Test winter separately, non-orthogonal data.
This removes the need for season as a variable in the main models, i.e. 2022 and 2023 both have spring data, so no need for a season var in those models.

test high-level animal patterns first, with and without PC1 (if significant % var explained) as covariate

then do species PCoA for 2022-23 (springs only)

then see what you 'might' have to explain, i.e. are there more animals in any measure anywhere that is ecologically relevant? If so, then dig into detailed microclimate (mclim) mechanism stats as needed.
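
A rough sketch of the 'with and without PC1' comparison, using only base R. This is a simplification: it fits a plain Poisson `glm` and leaves out the `(1 | year)` random effect and the site nesting discussed earlier, which would need a mixed-model package. The data and effect sizes are simulated and hypothetical.

```r
# Simulated stand-in data (hypothetical; one row per survey unit)
set.seed(42)
n <- 120
microsite <- factor(rep(c("open", "shrub", "shelter"), each = n / 3))
PC1 <- rnorm(n)                                   # placeholder for PCA scores
mu <- exp(0.5 + 0.4 * (microsite != "open"))      # more animals off the open
d <- data.frame(abundance = rpois(n, mu), microsite, PC1)

# High-level pattern first: abundance by microsite only
m0 <- glm(abundance ~ microsite, family = poisson, data = d)

# Same model with PC1 added as a covariate
m1 <- update(m0, . ~ . + PC1)

# Likelihood-ratio test: does PC1 explain anything beyond microsite?
cmp <- anova(m0, m1, test = "Chisq")
print(cmp)
```

If the microsite effect looks the same in `m0` and `m1`, the high-level pattern does not depend on the microclimate axis.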

zenrabbit commented 2 months ago

PCA Qs

Does the mean intensity include all the overnight 0s?
When you drop_na for humidity you lose 10 full rows.
Did you try hourly data with drop_na on all variables, to see what you still have? Also, keep only intensity > 0 and then take means, OR just run the PCA on those hourly data instead and take a mean after that?

And what is the cutoff for including on PC1?
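
To make the overnight-zero question concrete, here is a small base R sketch. The `hourly` data frame and its values are invented for illustration; `complete.cases()` is used as a base R stand-in for `drop_na` across all three measures.

```r
# Tiny hypothetical hourly table: intensity is 0 at night, one humidity NA
hourly <- data.frame(
  microsite = rep(c("shelter", "open"), each = 4),
  hour      = rep(c(0, 6, 12, 18), times = 2),
  temp      = c(12, 15, 30, 22, 10, 16, 38, 25),
  humidity  = c(40, 35, 15, NA, 45, 30, 10, 20),
  intensity = c(0, 200, 900, 100, 0, 400, 1500, 300)
)

# Drop rows with an NA in any of the three measures
vars <- c("temp", "humidity", "intensity")
complete <- hourly[complete.cases(hourly[, vars]), ]

# Keep daytime only (intensity > 0)
daytime <- complete[complete$intensity > 0, ]

# Mean intensity per microsite, with and without the overnight zeros
means_all <- aggregate(intensity ~ microsite, data = complete, FUN = mean)
means_day <- aggregate(intensity ~ microsite, data = daytime, FUN = mean)

print(nrow(complete))   # rows left after dropping NAs
print(means_all)
print(means_day)
```

Including the zeros pulls every mean down, so the two summaries answer different questions: average over the full day versus average while the sun is up.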

This is really more of an extended comment than an answer. First of all, it's not at all unusual for the first component in a PCA to capture a large percentage of the variance. Without rotation, this is a very common occurrence. That said, your question is concerned with one aspect of the multitude of subjective decisions required in producing a PCA, any PCA. There are many possible methods and rules for selecting components; here are a few. Your preference for using component 'interpretability' can be considered a sanctioned one, as it prioritizes human judgment over machine curation and decision-making. A second approach is to use a graphical heuristic, the so-called scree plot, which involves visually identifying a cutoff in a plot of eigenvalues and retaining those components falling before that threshold. More machine or statistically curated rules are also used. For instance, retain those components with eigenvalues greater than 1.0, thereby ensuring that each component contributes at least as much as a single input variable. This rule is probably the most commonly used default in many software packages.

zenrabbit commented 2 months ago

Summary data only: PC1 loadings are all less than 1. Can you pls find a few papers on 'how to' decide PCA and let us know?

    data.pca.summarized$loadings[, 1:2]

                    Comp.1     Comp.2
    temp       0.707731543  0.2959432
    intensity -0.003123336 -0.9067140
    humidity  -0.706474563  0.3004784
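
A quick check on the posted loadings, as a sketch in base R. The numbers are copied from the output above; the object name `loadings` is just for this example. Each column of a PCA loading matrix is a unit-length eigenvector, so individual loadings cannot exceed 1 in absolute value, and the 'greater than 1' rule applies to eigenvalues rather than to loadings.

```r
# Loadings copied from the output above
loadings <- matrix(
  c(0.707731543, -0.003123336, -0.706474563,
    0.2959432,   -0.9067140,    0.3004784),
  ncol = 2,
  dimnames = list(c("temp", "intensity", "humidity"), c("Comp.1", "Comp.2"))
)

# Squared loadings in each column sum to (approximately) 1
col_ss <- colSums(loadings^2)
print(round(col_ss, 4))

# Share of each component attributable to each variable
print(round(loadings^2, 3))
```

Read this way, Comp.1 is almost entirely a temperature versus humidity contrast, and intensity loads mainly on Comp.2.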

zenrabbit commented 2 months ago

An eigenvalue > 1 indicates that a PC accounts for more variance than any one of the original variables does in standardized data. This is commonly used as a cutoff for deciding which PCs to retain. It holds only when the data are standardized.
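
A minimal sketch of that rule in base R, on simulated data (the variable values are hypothetical, chosen only so that temp and humidity are negatively related). With `princomp(cor = TRUE)` the PCA runs on the correlation matrix, so the eigenvalues sum to the number of variables and 1 is the contribution of an average variable.

```r
# Simulated microclimate measures (hypothetical values)
set.seed(1)
n <- 200
temp      <- rnorm(n, 30, 5)
humidity  <- 60 - 1.5 * temp + rnorm(n, 0, 3)
intensity <- rexp(n, 1 / 500)
mclim <- data.frame(temp, intensity, humidity)

# Standardized PCA: cor = TRUE uses the correlation matrix
pca <- princomp(mclim, cor = TRUE)

eig      <- pca$sdev^2          # eigenvalues
prop_var <- eig / sum(eig)      # proportion of variance per component
keep     <- names(eig)[eig > 1] # components passing the eigenvalue > 1 rule

print(round(eig, 3))
print(round(prop_var, 3))
print(keep)

# Unstandardized PCA for contrast: eigenvalues are in raw units,
# so comparing them to 1 is not meaningful
eig_raw <- princomp(mclim)$sdev^2
print(round(eig_raw, 1))
```

In the unstandardized run the variable with the largest raw variance (here intensity) dominates the first component simply because of its units.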

zenrabbit commented 2 months ago

2022-23 spring-spring only


nargolg1 commented 2 months ago

https://htmlpreview.github.io/?https://github.com/nargolg1/PhD-Shelter-Shrub-Climate/blob/main/shelters/global%20model%20files/global-models.html

Updated RMD.