weecology / MATSS-LDATS

Macroecological LDA analysis of time series
MIT License

Changepoints with ~time always find 2? #35

Closed by diazrenata 5 years ago

diazrenata commented 5 years ago

This might be correct! But it seems odd.

All the changepoint models with formula = ~time find 2 changepoints (https://github.com/weecology/MATSS-LDATS/blob/add-bbs/analysis/reports/ts_report.md). I ran just Portal with formula = ~1 and got 4 changepoints.

@juniperlsimonis, do you have any tips on how to tell if there's something strange going on vs. this is the true result? Thanks!!

juniperlsimonis commented 5 years ago

whoa that is pretty weird! given the uniformity of that result, i wouldn't be surprised if something is a bit wonky under the hood. this is definitely where having the ability to mock up some data with known structures is going to help us track things like this down. as of right now, though (while i'm working on building that out), i don't have a quick and easy way to see if something strange is going on or if it's a real result. i think it'll require digging through the code pipeline to a couple of critical spots in the ts components. i'll put this in ldats and link back to this issue here. thanks for letting me know about it!!
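A rough sketch of what "mock data with known structure" could look like, using base R only. This is not LDATS's simulation tooling; the species count, sample size, probabilities, and the names `true_changepoint` and `counts` are all assumptions made up for illustration. The idea is to build a count table whose composition shifts at one known time step, so a changepoint model run on it has a known right answer.

```r
# Illustrative only: simulate a community count table with ONE known changepoint.
# All values below are assumptions, not taken from Portal or the MATSS-LDATS pipeline.
set.seed(42)

n_steps <- 60            # number of sampling periods
true_changepoint <- 30   # composition shifts after this time step
n_species <- 4

# Species composition before and after the shift (each sums to 1)
prob_before <- c(0.6, 0.2, 0.1, 0.1)
prob_after  <- c(0.1, 0.1, 0.2, 0.6)

# One multinomial draw of 200 individuals per time step
counts <- t(sapply(seq_len(n_steps), function(step) {
  prob <- if (step <= true_changepoint) prob_before else prob_after
  rmultinom(1, size = 200, prob = prob)[, 1]
}))
colnames(counts) <- paste0("sp", seq_len(n_species))

# Covariate table with a simple time index to pair with the counts
covariates <- data.frame(timestep = seq_len(n_steps))
```

If a model fit to `counts` reports a different number of changepoints depending only on which time-like covariate is in the formula, that would point toward the pipeline rather than the data.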

diazrenata commented 5 years ago

I just ran some changepoint models with different covariates and formulas to see what happened.

I used just the Portal data and the same LDA model for all the changepoints. I created new covariate columns for month = month of the census and timestep = row number.

It looks like either formula = ~1 or ~month gives numbers other than 2 (5 and 4), but ~newmoonnumber or ~timestep gives 2.

EDIT - I just did another run, adding a normal noise covariate (rnorm(mean = 100, sd = 20)). That got 4 changepoints, and the re-run of ~timestep gave 3. So it looks like it's just ~newmoonnumber that gave 2. I'll work on generalizing this so it's easy to explore with other datasets, because I think some cross-dataset info would be useful at this point.
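For reference, the covariate columns described above could be built roughly like this in base R. The data frame here is a hypothetical stand-in (the `censusdate` column, the starting `newmoonnumber`, and the 24 rows are assumptions); the real table comes from the pipeline.

```r
# Illustrative stand-in for the covariate table; values are made up.
set.seed(1)
n_censuses <- 24
covariates <- data.frame(
  newmoonnumber = seq(from = 300, length.out = n_censuses),
  censusdate = seq(as.Date("2001-01-15"), by = "month", length.out = n_censuses)
)

# month = month of the census, timestep = row number
covariates$month <- as.integer(format(covariates$censusdate, "%m"))
covariates$timestep <- seq_len(nrow(covariates))

# normal noise covariate, one draw per census
covariates$noise <- rnorm(nrow(covariates), mean = 100, sd = 20)

# the formulas compared in this thread, collected as a list
formulas <- c(~1, ~month, ~newmoonnumber, ~timestep, ~noise)

# every variable named in a formula should exist in the covariate table
formula_vars <- unique(unlist(lapply(formulas, all.vars)))
stopifnot(all(formula_vars %in% names(covariates)))
```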

Maybe this is helpful? 🔮

https://github.com/weecology/MATSS-LDATS/blob/explore-cpt/analysis/reports/test_covariates_ts_report.md

PS if you want to explore the model objects, they're in the drake cache on HiPerGator and on my computer right now, or you can run analysis/test_changepoint_pipeline.R.

juniperlsimonis commented 5 years ago

ooh, totally, thanks!

diazrenata commented 5 years ago

This has been resolved with updates to the pipeline & LDATS 0.2.0