NOAA-EDAB / ecodata

A data package for reporting on Northeast Continental Shelf ecosystem status and trends.
https://noaa-edab.github.io/ecodata/

Create gls() function #17

Closed slarge closed 1 year ago

slarge commented 5 years ago

In addition to ecodata::geom_gls(), it might be a good idea to generalize the gls() function so that summary(ecodata::gls(x, y, data)) returns the fit parameters. geom_gls() would then call ecodata::gls() instead of having the gls code embedded, as it is currently written.

Or, should we create a simple new package for the gls() function (and maybe the glm, for count data, too)? Open to thoughts @seanhardison1 @andybeet @sgaichas @slucey
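A rough sketch of what such a standalone function might look like, assuming it wraps nlme::gls() with a linear trend and AR(1) errors. The name fit_trend_gls() and its string-based arguments are hypothetical and are not the code currently embedded in geom_gls():

```r
library(nlme)

# Hypothetical wrapper (not the ecodata implementation): fit a linear trend
# with AR(1) errors and return the fitted gls object, so that summary(),
# coef(), AIC(), etc. work on the result.
fit_trend_gls <- function(x, y, data) {
  # x and y are column names given as strings, e.g. "x" and "y"
  form <- stats::reformulate(x, response = y)
  nlme::gls(form,
            data = data,
            correlation = nlme::corAR1(form = ~ 1),
            method = "ML")
}

# Simulated series: linear trend plus Gaussian noise
set.seed(1)
dat <- data.frame(x = 1:30)
dat$y <- 0.1 * dat$x + rnorm(30, sd = 0.35)

fit <- fit_trend_gls("x", "y", dat)

# Coefficient table: estimates, standard errors, t-values, p-values
summary(fit)$tTable
```

Returning the fitted object rather than a custom list means existing nlme methods do the reporting, and a geom could call the same function internally.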

slarge commented 5 years ago

Created a new branch: https://github.com/NOAA-EDAB/ecodata/tree/gls-function @seanhardison1 and @andybeet, care to kick the tyres? It should also still plot using geom_gls().

# Simulate a linear trend with Gaussian noise
m <- 0.1
x <- 1:30
y <- m * x + rnorm(30, sd = 0.35)

data <- data.frame(x = x,
                   y = y)

# Fit with ts_gls() from the gls-function branch
m1 <- ts_gls(data)

m1$ts_gls
[1] "linear_norm"

seanhardison1 commented 5 years ago

I'm on board with building out a function to produce fit summaries. Another thing we may want to consider is including a GLS model with AR2 error in the selection process. That may be a different issue though.
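As a rough illustration of what adding an AR2 candidate might involve, here is a sketch that fits the same linear trend under three error structures and picks the lowest AIC. It assumes nlme::gls() with corAR1() and corARMA(p = 2), and uses plain AIC for simplicity; the candidate names and the selection rule are assumptions, not the existing selection code:

```r
library(nlme)

# Simulated series: linear trend plus Gaussian noise
set.seed(42)
dat <- data.frame(x = 1:30)
dat$y <- 0.1 * dat$x + rnorm(30, sd = 0.35)

# Candidate error structures (names are illustrative).
# NULL means independent, normally distributed errors.
cors <- list(
  norm = NULL,
  ar1  = corAR1(form = ~ 1),
  ar2  = corARMA(form = ~ 1, p = 2, q = 0)
)

# Fit each candidate by ML so the information criteria are comparable.
# A fit that fails to converge is dropped rather than stopping the run.
fits <- lapply(cors, function(cs) {
  tryCatch(
    gls(y ~ x, data = dat, correlation = cs, method = "ML"),
    error = function(e) NULL
  )
})
fits <- Filter(Negate(is.null), fits)

# Select the candidate with the lowest AIC
aic  <- sapply(fits, AIC)
best <- names(which.min(aic))
best
```

With short series the AR2 fit can be unstable, which is one reason to guard each fit and to consider a small-sample correction such as AICc.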

slarge commented 5 years ago

Should this package even have geom_gls (and the newly proposed ts_gls and ts_glm)? Let's mull over keeping the data in ecodata and relocating the analytical functions into a proper analytical package. If we need to update the ts_gls function between SOE reports, what would that look like with regard to the versioning scheme?

andybeet commented 5 years ago

So do we want to version just the data for the report, the entire report (including figures, trend lines, etc.), or both? I think the answer to this dictates how we structure everything.

If we moved the plotting functionality out of the package, then we would be left with data in one package and analytics in another. I think versioning the actual report might then be tricky, since it will depend on both the version of the data and the version of the analytics tool. And I am not sure how easy it would be to reconstruct historic reports from this structure.

slarge commented 5 years ago

So, we need to decide if we want a support package for the SOE (data + analytics) or a data package for ecosystem reporting (data) and a package for analyses (probably limited in scope to timeseries analyses and visualization for ecosystem reporting). I don't have skin in the game for either, but I want to be mindful of how we structure products before they become too mature and early decisions are baked in.

In support of one package: 1) Versioning the report only requires one package. 2) Easier report reconstruction (I guess this assumes that there are breaking changes in analyses over versions?)

In support of two packages: 1) Assume we develop more analytical functions that other groups interested in ecosystem reporting start to use (I hope we do). Would they be expected to install the entire data package just for the analytical functions? 2) If we want to include data that is not used in the SOE but is used elsewhere, would it be in scope to add it to the package?

andybeet commented 5 years ago

This is the exact discussion Sean and I had.

slarge commented 5 years ago

After discussing with @sgaichas and @seanhardison1, we decided that after the upcoming 2019 SOE release, ecodata will only hold processed time-series data. Time-series functions will live in a separate package. Versioning shouldn't be an issue because ecodata will depend upon the ecoanalysis(?) package.

seanhardison1 commented 5 years ago

@slarge how do we want to handle geoms? Should plotting functions move to ecotrend as well?

slarge commented 5 years ago

I think geoms should probably live with the analytical functions in the ecotrend package (e.g., mgcv::plot.gam() and similar, where a plotting function or geom is created specifically for an analysis). Are there arguments against?

seanhardison1 commented 5 years ago

Works for me