weecology / MATSS

R Package for Macroecological Analysis of Time Series Structure (MATSS)
https://weecology.github.io/MATSS

Add code for processing datasets from Popler #95

Open ha0ye opened 5 years ago

ha0ye commented 5 years ago

Popler is a package for obtaining LTER datasets in a (somewhat) standardized way. We will need code that processes the data into the format MATSS requires:

• Obtaining the data files
• Metadata
• Covariates
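
As a rough, language-neutral sketch of what that processing step might look like (Python here for brevity; the real code would live in the R package), the snippet below pivots long-format observation records into a time-by-species table and bundles it with covariates and metadata. The record fields (`year`, `species`, `count`) and the output field names are assumptions made for illustration, not Popler's or MATSS's actual schema.

```python
from collections import defaultdict


def to_community_time_series(records, time_key="year",
                             species_key="species", count_key="count"):
    """Pivot long-format records into a time-by-species abundance table.

    All field names are hypothetical; adjust them to the real data.
    """
    # Sum counts for each (time, species) pair.
    totals = defaultdict(lambda: defaultdict(int))
    for rec in records:
        totals[rec[time_key]][rec[species_key]] += rec[count_key]

    times = sorted(totals)
    species = sorted({sp for by_sp in totals.values() for sp in by_sp})

    # One row per time step, one column per species; missing pairs become 0.
    rows = [[totals[t].get(sp, 0) for sp in species] for t in times]

    return {
        "abundance": {"species": species, "rows": rows},
        "covariates": {time_key: times},
        "metadata": {"time_column": time_key, "source": "hypothetical example"},
    }


# Toy records standing in for a downloaded dataset.
records = [
    {"year": 2001, "species": "A", "count": 3},
    {"year": 2001, "species": "B", "count": 1},
    {"year": 2002, "species": "A", "count": 5},
    {"year": 2001, "species": "A", "count": 2},
]
result = to_community_time_series(records)
```

Treating an absent (time, species) pair as zero is itself an assumption; for datasets where a species was simply not surveyed in a given year, a missing value would be more appropriate.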

ha0ye commented 5 years ago

@bleds22e has been working with the popler package, and will make a new branch for this work (so that everyone can help out).

ha0ye commented 5 years ago

(Previously on hold pending #101; now resolved.)

diazrenata commented 5 years ago

Currently, the next steps are:

ha0ye commented 5 years ago

Update on Popler data integration:
• the LTER sites that are included in Popler's database each have their site-specific data transformed into Popler's format
• this can contain mixtures of different data sampling schemes, so generating community time series data is non-trivial
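
To illustrate why mixed sampling schemes make this non-trivial, here is a minimal sketch (in Python, with made-up field names `site`, `plot`, and `year`) that counts how many distinct years each sampling unit was observed and keeps only the units with enough years to be treated as a time series. The threshold `min_years` and the choice of unit keys are assumptions, not anything Popler defines.

```python
from collections import defaultdict


def years_sampled_per_unit(records, unit_keys=("site", "plot"), time_key="year"):
    """Map each sampling unit to the sorted list of years it was observed."""
    years = defaultdict(set)
    for rec in records:
        unit = tuple(rec[k] for k in unit_keys)
        years[unit].add(rec[time_key])
    return {unit: sorted(ys) for unit, ys in years.items()}


def candidate_time_series(records, min_years=3, **kwargs):
    """Return the units observed in at least `min_years` distinct years."""
    coverage = years_sampled_per_unit(records, **kwargs)
    return sorted(unit for unit, ys in coverage.items() if len(ys) >= min_years)


# Toy records: plot 1 is resampled yearly, plot 2 was a one-off survey.
records = [
    {"site": "S1", "plot": 1, "year": 2000},
    {"site": "S1", "plot": 1, "year": 2001},
    {"site": "S1", "plot": 1, "year": 2002},
    {"site": "S1", "plot": 2, "year": 2001},
]
coverage = years_sampled_per_unit(records)
candidates = candidate_time_series(records, min_years=3)
```

In this toy case only plot 1 qualifies. With real data the hard part is choosing the unit keys, since each dataset nests its sampling units differently.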

thoughts on ways forward:
• contact LTER data managers for already-prepared time series datasets
• manually clean and transform each of the (many) datasets ourselves
• see if Popler has information on the backend about the different types of datasets it's pulling in from each LTER; maybe this would let us filter for time series data more quickly (contact Aldo for this?)
• what is the overlap of datasets with BioTIME? (is it easier to try to get these datasets from BioTIME?)

current status:
• we are compiling some summary tables on how the different LTER sites have their data organized hierarchically within Popler (Popler calls these "spatial replication levels")
• https://github.com/ha0ye/popler will eventually contain generated Rmarkdown reports for these summary tables (one report for each LTER dataset entry), to be uploaded once they are finished being generated

:dizzy_face: :tired_face:

ha0ye commented 5 years ago

@diazrenata also suggested we could do some digging through the source code for popler to see if that yielded any clues about how it might be processing data on its end.

ha0ye commented 5 years ago

It sounds like there may be unique code for importing each dataset into popler, so this may not be a feasible path to lessen the workload of manually dealing with each dataset.

I think our steps forward are:

ha0ye commented 4 years ago

This issue needs a decision one way or the other (i.e., whether to try to include US LTER data via Popler for V1).