Closed stephenturner closed 3 years ago
False start earlier, better now.
@stephenturner this is great.
i ran through the code. pretty sure i track what it's doing. made one minor change to the submission script (removing the plausibility check and replacing it with an inline comment)
last thing before i merge ... i want to build the pkg and test in our automation pipeline. if that generates a submission and params file i'll go ahead and merge this and close #48
also just noting here (because i didn't see it at first in the rendered tibble in your comment) ... this param output includes the AIC, BIC, etc. from each model ... so we can theoretically go back, try different params, and compare / better understand the criteria by which the model was selected.
I went back and forth on whether to include output from tidy(), which is what brings in coefficients and term p-values. glance() adds the AIC, BIC, etc. If we remove the results from tidy() we'll have one row per location.
hmmm ok. i don't think it's a big deal to parse with distinct(), and i do like having that model output for posterity. so (at least for now) my opinion is to leave it in
cool. just forced the automation pipeline to run. workflow executed fine and submission and param csv files were written to S3 as expected.
merging
extract_arima_params(): extracts ARIMA parameters plus tidy() and glance() stats from an ARIMA mable.

In addition to extracting the p,d,q parameters I'm also grabbing the results from tidy() and glance() on the model object. You'll note that we have more rows than we have locations. When p>1, tidy() returns one row per AR term, so that location ends up with more than one row. For example, see location=13. To get one row per location, drop the term:p.value columns and pipe to dplyr::distinct().
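
A rough sketch of that last step, in case it helps. The `params` tibble here is a made-up stand-in for the real extract_arima_params() output (hypothetical values, and only a few of the glance() columns), assuming the tidy() columns sit contiguously from `term` through `p.value`:

```r
library(dplyr)
library(tibble)

# Toy stand-in for the params output; all values are invented for illustration.
# Location "13" has p = 2, so tidy() contributes two rows (ar1, ar2).
params <- tribble(
  ~location, ~p, ~d, ~q, ~term, ~estimate, ~std.error, ~statistic, ~p.value, ~AIC, ~BIC,
  "01",       1,  1,  0, "ar1",  0.50,      0.10,       5.0,        0.001,    100,  105,
  "13",       2,  1,  0, "ar1",  0.40,      0.10,       4.0,        0.002,     90,   97,
  "13",       2,  1,  0, "ar2",  0.20,      0.10,       2.0,        0.050,     90,   97
)

# Drop the per-term tidy() columns, then collapse the now-identical rows.
one_per_location <- params %>%
  select(-(term:p.value)) %>%
  distinct()
```

This keeps p, d, q and the glance() stats (AIC, BIC, ...) per location, and only discards the per-term coefficients and p-values.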