cont-limno / LAGOSNE

Interface to the LAke multi-scaled GeOSpatial & temporal database :earth_americas:
https://cont-limno.github.io/LAGOSNE/

Provide user with all column names within each table + metadata #14

Closed limnoliver closed 7 years ago

limnoliver commented 8 years ago

@jsta where is the best place for this information? As a user, I could imagine wanting this to be in two places: 1) some sort of documentation listing each variable within each table, along with some metadata (units, plain English description, etc). We could have documentation for each table, but where (in the package structure) would this go? 2) in a table format, similar to the info table that is currently imported with the rds file.

jsta commented 8 years ago

@limnoliver As for your option 1, we could write roxygen blocks (http://stackoverflow.com/questions/2310409/how-can-i-document-data-sets-with-roxygen) in the LAGOS-package.R file. I added an example for the iws table (3f52b46). See https://github.com/jsta/LAGOS/issues/7.
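For readers unfamiliar with the approach, here is a minimal sketch of what a roxygen block documenting a data table can look like. The table name `example_table` and its columns are hypothetical and chosen only for illustration; they are not the contents of the iws block added in the commit above.

```r
#' Example table (hypothetical, for illustration only)
#'
#' A short plain English description of what the table contains.
#'
#' @format A data frame with one row per lake and the following columns:
#' \describe{
#'   \item{lake_id}{unique lake identifier (hypothetical column)}
#'   \item{area_ha}{watershed area, in hectares (hypothetical column)}
#' }
#' @docType data
#' @name example_table
NULL
```

Documenting against `NULL` lets the block live in a package-level file without being attached to a function, and each `\item` gives a natural place for units and a description.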

jsta commented 8 years ago

I think we should provide the information in your option 2 as a figure in the README or in a vignette introducing the package.

limnoliver commented 8 years ago

@jsta - I really like this implementation. I think it would be super useful to have the plain English description here, along with units, so people have one easy place to access this info. Would we document every table in this file (LAGOS-package.R)?

jsta commented 8 years ago

It makes sense to me to have it all in LAGOS-package.R. I don't think it would cause any problems.

jsta commented 8 years ago

I got the idea from http://stackoverflow.com/a/7088603/3362993

jsta commented 8 years ago

Looks like this file has the column definitions we need for GEO: CSI_LIMNO_Manuscripts-presentations/CSI_DATA paper/Metadata files/LAGOSGEO_Field_List.xlsx

limnoliver commented 8 years ago

Yes, that has some of the variables. I had a chat with Pat and Kathy, and I showed Kathy the structure of the metadata. She is going to go ahead and populate a limno and geo metadata table, so we can ignore this for now and use Kathy's product.

We also talked about how the variables are redundant, and how working in a "scale" option along with a "variable type" option would be a useful way for the user to think about the database. So in lagos_select, it would be something like lagos_select(type = c("limno", "geo"), scale = c("HU4", "HU6", "HU8", ...), subset = c("deposition", "hydrology", "water quality"))
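A minimal sketch of how that kind of selection could work, assuming a lookup table that tags every column with a type, scale, and subset. The function name `select_vars_sketch`, the lookup object, and all table and column names below are hypothetical; this is not the package's lagos_select implementation.

```r
# Hypothetical lookup: one row per table/column pair, with its tags.
var_lookup <- data.frame(
  table  = c("hu4_table", "hu4_table", "hu8_table", "limno_table"),
  column = c("hu4_dep_no3", "hu4_runoff", "hu8_dep_no3", "tp"),
  type   = c("geo", "geo", "geo", "limno"),
  scale  = c("HU4", "HU4", "HU8", NA),
  subset = c("deposition", "hydrology", "deposition", "water quality"),
  stringsAsFactors = FALSE
)

# Return the table/column pairs that match every filter supplied.
# A NULL filter means "do not restrict on this tag".
select_vars_sketch <- function(lookup, type = NULL, scale = NULL, subset = NULL) {
  keep <- rep(TRUE, nrow(lookup))
  if (!is.null(type))   keep <- keep & lookup$type   %in% type
  if (!is.null(scale))  keep <- keep & lookup$scale  %in% scale
  if (!is.null(subset)) keep <- keep & lookup$subset %in% subset
  lookup[keep, c("table", "column")]
}

# All HU4-scale deposition variables from the geo tables.
res <- select_vars_sketch(var_lookup, type = "geo", scale = "HU4",
                          subset = "deposition")
```

Filtering a single lookup table keeps the user-facing arguments independent of how the underlying tables are named.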

jsta commented 8 years ago

I like the idea. Very user-friendly. Do we want to retain the current ability to specify an exact table by name? I could make the table_column_nested object optional.

limnoliver commented 8 years ago

Yeah, I think we should make table_column_nested optional, so that users can still give specific table and column combos.

limnoliver commented 8 years ago

Also, it would be nice if specifying table_column_nested worked as and/or with the other options, so that if you want a specific column from one table plus all huc4 deposition variables, you can combine the two in a single call.
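One way the and/or behaviour could be sketched is as a union of the explicit picks and the tag-based matches. This assumes table_column_nested is a named list mapping table names to column names, which is a guess at its structure; the function name `select_combined_sketch`, the lookup table, and all table and column names are hypothetical.

```r
# Hypothetical lookup: one row per table/column pair, with its tags.
lookup <- data.frame(
  table  = c("hu4_table", "hu4_table", "limno_table"),
  column = c("hu4_dep_no3", "hu4_runoff", "tp"),
  scale  = c("HU4", "HU4", NA),
  subset = c("deposition", "hydrology", "water quality"),
  stringsAsFactors = FALSE
)

select_combined_sketch <- function(lookup, table_column_nested = NULL,
                                   scale = NULL, subset = NULL) {
  # Tag-based matches, used only when at least one tag filter is given.
  by_tag <- lookup[0, c("table", "column")]
  if (!is.null(scale) || !is.null(subset)) {
    keep <- rep(TRUE, nrow(lookup))
    if (!is.null(scale))  keep <- keep & lookup$scale  %in% scale
    if (!is.null(subset)) keep <- keep & lookup$subset %in% subset
    by_tag <- lookup[keep, c("table", "column")]
  }
  # Explicit table/column picks (assumed to be a named list).
  explicit <- lookup[0, c("table", "column")]
  if (!is.null(table_column_nested)) {
    explicit <- data.frame(
      table  = rep(names(table_column_nested), lengths(table_column_nested)),
      column = unlist(table_column_nested, use.names = FALSE),
      stringsAsFactors = FALSE
    )
  }
  # Union of both sets, dropping pairs requested twice.
  unique(rbind(explicit, by_tag))
}

# One specific column plus every HU4 deposition variable.
res <- select_combined_sketch(lookup,
                              table_column_nested = list(limno_table = "tp"),
                              scale = "HU4", subset = "deposition")
```

Treating the two inputs as a union means either one can be omitted without changing how the other behaves.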