internetofwater / nldi-services

Network Linked Data Index Navigation Web Services
https://waterdata.usgs.gov/blog/nldi-intro/
Creative Commons Zero v1.0 Universal

Catchment Characteristics #38

Closed dblodgett-usgs closed 7 years ago

dblodgett-usgs commented 8 years ago

Working off of Mike W's dataset available for review in ScienceBase: https://www.sciencebase.gov/catalog/item/5669a79ee4b08895842a1d47

Some basic aspects of the data:

General Notes: As far as the NLDI is concerned, each characteristic should be attached to a COMID and exposed where applicable.

For now, we will set aside the time-series characteristics. They need to be exposed as time-series objects and will need somewhat different handling.

  1. Single Local Catchment. Should expose local catchment characteristics.
  2. Navigate/UT query without distance (basin) query. Should expose total drainage area characteristic.
  3. Basin query with ‘divergence routed characteristics’ flag set to true. Should expose divergence routed drainage area characteristic.
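
The three cases above could be sketched as a small lookup. This is only an illustration: the function name, the query labels "catchment" and "basin", and the returned strings are hypothetical and not part of the NLDI.

```python
def characteristic_type(query: str, diverted: bool) -> str:
    """Pick which variety of characteristic a request should expose.

    The query labels and return values are hypothetical; they only
    mirror the three cases listed above.
    """
    if query == "catchment":
        # Case 1: a single local catchment.
        return "local"
    if query == "basin":
        # Case 3 when the flag is set, otherwise case 2.
        return "divergence_routed" if diverted else "total"
    raise ValueError(f"unknown query type: {query!r}")
```

The flag is a required argument here so that the caller has to choose between total and divergence routed values explicitly.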

Each characteristic can be organized in a dataset and each dataset can be organized in a theme.

Each characteristic has an ID, a description, and units.

Each dataset has a title, a science base item, and a theme.

DRAFT Web API Idea (Edited 10-31-2016)

/{featureSource}/{featureSourceId}/ returns known feature information, including the geometry and the URLs for the navigate and basin methods.

/{featureSource}/{featureSourceId}/navigate/UT/basin or... /{featureSource}/{featureSourceId}/basin returns the polygon for the total upstream basin and the .../chars option.

/{featureSource}/{featureSourceId}/.../chars (chars applies to featureSourceID's catchment or the basin function) Returns urls for ...chars/all, ...chars/meta, or chars/{filter}?

/{featureSource}/{featureSourceId}/.../chars/meta returns a list of available characteristics that has a bunch of useful information for a client to build a list and/or point to additional metadata.

The characteristics metadata table would include fields such as: Characteristic ID, Characteristic Description, Characteristic units, Theme Label, Theme URL, Dataset Label, Dataset URL.
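
As a rough sketch of how the characteristic, dataset, and theme relationships could flatten into one row of that metadata table: the class names, attribute names, and dictionary keys below are assumptions made for illustration, not an agreed schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Theme:
    label: str
    url: str


@dataclass(frozen=True)
class Dataset:
    label: str
    url: str  # assumed to point at the dataset's ScienceBase item
    theme: Theme


@dataclass(frozen=True)
class Characteristic:
    characteristic_id: str
    description: str
    units: str
    dataset: Dataset


def meta_row(c: Characteristic) -> dict:
    """Flatten one characteristic into the fields listed above."""
    return {
        "characteristic_id": c.characteristic_id,
        "characteristic_description": c.description,
        "characteristic_units": c.units,
        "theme_label": c.dataset.theme.label,
        "theme_url": c.dataset.theme.url,
        "dataset_label": c.dataset.label,
        "dataset_url": c.dataset.url,
    }
```

A chars/meta response could then be a list of such rows, one per available characteristic.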

/{featureSource}/{featureSourceId}/.../chars/all returns all characteristics

/{featureSource}/{featureSourceId}/{watershedType}/chars/{filter} returns catchment characteristics for the selected catchment.

/{featureSource}/{featureSourceId}/{watershedType}/chars/{filter}?diverted=T/F divergence routed characteristics available using the diverted flow flag.
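
To make the draft paths concrete, here is a minimal sketch of a client-side helper that assembles them. The helper name and its parameters are hypothetical, and encoding the flag as "T" or "F" is an assumption read from the `diverted=T/F` notation above.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def chars_path(
    feature_source: str,
    feature_source_id: str,
    watershed_type: str,
    char_filter: str = "all",
    diverted: Optional[bool] = None,
) -> str:
    """Build a draft .../chars/... path; omit the flag when diverted is None."""
    segments = [feature_source, feature_source_id, watershed_type, "chars", char_filter]
    # Percent-encode each segment so ids containing "/" or spaces stay intact.
    path = "/" + "/".join(quote(str(s), safe="") for s in segments)
    if diverted is not None:
        path += "?" + urlencode({"diverted": "T" if diverted else "F"})
    return path
```

For example, `chars_path("comid", "12345", "basin", "meta")` gives `/comid/12345/basin/chars/meta`, where the source name and id are made-up values.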

dblodgett-usgs commented 8 years ago

Three relevant gists written in R:

  1. Download all the data: https://gist.github.com/dblodgett-usgs/5b2e0f4cc025ea5100ec0979272f31c5
  2. Cleanup Metadata and comb ScienceBase: https://gist.github.com/dblodgett-usgs/045a85a270006d149ccdeb72fa08d990
  3. Parse Apart all the Metadata: https://gist.github.com/dblodgett-usgs/7fe1a8625f94212d364a738f5c9ed121

Also, a note for later, pgfutter command to load this data looks like:

```r
Sys.setenv(DB_NAME="nldi",
           DB_HOST="localhost",
           DB_PORT="5432",
           DB_USER="dblodgett",
           DB_SCHEMA="cat_atts")
#./pgfutter --user "dblodgett" --db "nldi" --schema "test" csv ECOL3_ACC_CONUS.TXT
```

dblodgett-usgs commented 8 years ago

For data management, I'd propose addition of four tables to the NLDI database.

Probably add a new schema called characteristics and put the tables in there.

One question is what the table columns should be. The data is currently in wide format but I would imagine a long format would be better for the database. I think using the same data type for the value will be OK, but I should verify that a float is reasonable for all of them. There's a chance we'll need to handle categorical values, in which case we'll probably have to have separate tables.

I'm thinking: comid, attribute, value
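
A minimal sketch of the wide-to-long reshaping described above, assuming each wide row is a dict keyed by column name with a "COMID" column. The function name, that column name, and the example attribute names are assumptions for illustration only.

```python
def wide_to_long(rows, id_column="COMID"):
    """Turn wide rows into (comid, attribute, value) tuples.

    Every non-id column becomes one long row. Values are cast to
    float, so a categorical value raises ValueError instead of being
    loaded silently.
    """
    long_rows = []
    for row in rows:
        comid = int(row[id_column])
        for attribute, value in row.items():
            if attribute == id_column:
                continue
            long_rows.append((comid, attribute, float(value)))
    return long_rows


# Made-up example row with two attribute columns.
example = [{"COMID": "1", "ATTR_A": "0.5", "ATTR_B": "2"}]
print(wide_to_long(example))
```

Rows read with `csv.DictReader` would fit this shape. The float cast doubles as the check mentioned above: any column that fails it is a candidate for separate handling.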

dblodgett-usgs commented 8 years ago

Should probably not default to total drainage area. Total drainage area may be appropriate for some questions, but it's really not the answer for other questions. If a question has to do with something that scales with streamflow, you probably want divergence routed. However, we don't know all the divergences or how much water is flowing through them all the time.

dblodgett-usgs commented 8 years ago

Making good progress on variable / metadata reconciliation. The script here: https://gist.github.com/dblodgett-usgs/a0a08a35c1d050ca7601fcdbd92a4a9c

shows all the corrections that needed to be made to get things in order.

dblodgett-usgs commented 7 years ago

Streamstats dev documentation for similar services is here: http://ssdev.cr.usgs.gov/streamstatsservices/#

dblodgett-usgs commented 7 years ago

Closing this ticket for now. Will open a new one summarizing outcome of several threads based on this.