DOI-USGS / mda.streams

backend tools for powstreams
Creative Commons Zero v1.0 Universal

update mda.streams handlers to deal with sbtools v0.16.0 #234

Closed jordansread closed 8 years ago

jordansread commented 8 years ago

See https://github.com/USGS-R/mda.streams/issues/233 for discussion. We know the query result now comes back in a different format. Other changes may also need to be evaluated and addressed.

aappling-usgs commented 8 years ago

diagnosing the extent of this problem. looks like locate by 'tag' works whether the item exists or not, but locate by 'dir' fails whether the item exists or not. it always comes down to the same problem: the query behind 'dir' now returns a list rather than a data.frame as it used to (https://github.com/USGS-R/sbtools/commit/0f454e131d7b49f32573c6ca3ec6e7c217f8e4a1).

# existing meta: by='tag' works, by='dir' does not
> locate_meta('basic', by='tag')
[1] "559c1bc8e4b0b94a6401792e"
> locate_meta('basic', by='dir')
Error in query_out[tolower(query_out$title) == tolower(query$title[1]),  : 
  incorrect number of dimensions
> locate_meta('basic', by='either') # tries tag first, stops b/c it works
[1] "559c1bc8e4b0b94a6401792e"

# non-existing meta: by='tag' works, by='dir' or 'either' does not
> locate_meta('notameta', by='tag')
[1] NA
> locate_meta('notameta', by='either')
Error in query_out[tolower(query_out$title) == tolower(query$title[1]),  : 
  incorrect number of dimensions

# existing ts: by='tag' works, by='dir' does not
> locate_ts('doobs_nwis', 'nwis_02322688')
[1] "5581c5cfe4b023124e8f38a4"
> locate_ts('doobs_nwis', 'nwis_02322688', by='dir')
Error in query_out[tolower(query_out$title) == tolower(query$title[1]),  : 
  incorrect number of dimensions
> locate_ts('doobs_nwis', 'nwis_02322688', by='either') # tries tag first, stops b/c it works
[1] "5581c5cfe4b023124e8f38a4"

# non-existing ts: by='tag' works, by='dir' or by='either' does not.
> locate_ts('par_nwis', 'nwis_02322688')
[1] NA
> locate_ts('par_nwis', 'nwis_02322688', by='dir')
Error in query_out[tolower(query_out$title) == tolower(query$title[1]),  : 
  incorrect number of dimensions
> locate_ts('par_nwis', 'nwis_02322688', by='either') # tries tag first, doesn't find so tries dir & breaks
Error in query_out[tolower(query_out$title) == tolower(query$title[1]),  : 
  incorrect number of dimensions

# all the same patterns for sites
> locate_site('nwis_07239450', by='either')
[1] "556f1f5ae4b0d9246a9fc695"
> locate_site('nwis_07239450', by='dir')
Error in query_out[tolower(query_out$title) == tolower(query$title[1]),  : 
  incorrect number of dimensions
> locate_site('nwis_0723', by='dir')
Error in query_out[tolower(query_out$title) == tolower(query$title[1]),  : 
  incorrect number of dimensions
> locate_site('nwis_0723', by='either')
Error in query_out[tolower(query_out$title) == tolower(query$title[1]),  : 
  incorrect number of dimensions
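
The error message itself can be reproduced without sbtools. This is a minimal sketch, assuming the failing line in locate_item has the shape shown in the traceback above; `df_out` and `list_out` are made-up stand-ins for the old and new query results, not real query output.

```r
# Old behavior (stand-in): a data.frame supports [rows, cols] subsetting.
df_out <- data.frame(title = "nwis_01646305",
                     id = "556f2978e4b0d9246a9fcfa1",
                     stringsAsFactors = FALSE)
df_hit <- df_out[tolower(df_out$title) == tolower("NWIS_01646305"), ]

# New behavior (stand-in): a plain list has no dim attribute, so the
# same two-index expression raises an error instead of returning rows.
list_out <- list()
err <- tryCatch(
  list_out[tolower(list_out$title) == tolower("nwis_0723"), ],
  error = function(e) conditionMessage(e)
)

print(df_hit$id)
print(err)
```

The list case fails the same way whether or not it has elements, which matches the pattern above where by='dir' breaks for existing and non-existing items alike.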

aappling-usgs commented 8 years ago

digging into locate_item

for 'tag' we call query_item_identifier, which returns a data.frame whether the item exists or not:

> item <- query_item_identifier(scheme='mda_streams', type='site_root', key='nwis_0723', limit=5000)
> item
data frame with 0 columns and 0 rows
> item <- query_item_identifier(scheme='mda_streams', type='site_root', key='nwis_01646305', limit=5000)
> item
          title                       id
1 nwis_01646305 556f2978e4b0d9246a9fcfa1

for 'dir' (or 'either' when 'tag' doesn't find it) we call query_item_in_folder, which used to return a data.frame like query_item_identifier but now returns a list of sbitems:

> query_out <- query_item_in_folder(text="nwis_0723", folder="5487139fe4b02acb4f0c8110", limit=5000)
> query_out
list()
> query_out <- query_item_in_folder(text="nwis_01646305", folder="5487139fe4b02acb4f0c8110", limit=5000)
> is.list(query_out)
[1] TRUE
> is.data.frame(query_out)
[1] FALSE
> query_out[[1]]
<ScienceBase Item> 
  Title: nwis_01646305
  Creator/LastUpdatedBy:      / 
  Provenance (Created / Updated):   / 
  Children: TRUE
  Item ID: 556f2978e4b0d9246a9fcfa1
  Parent ID: 
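
One way to do the title match against a list instead of a data.frame is sketched below. It is not necessarily the fix that went into the pull request. `find_id_by_title` is a hypothetical helper, and the mock items are plain lists that assume each sbitem exposes `$title` and `$id`, as the printed item above suggests.

```r
# Hypothetical helper: return the id of the first item whose title matches
# (case-insensitively), or NA when the list is empty or nothing matches.
# Assumes every element has character fields $title and $id.
find_id_by_title <- function(items, title) {
  if (length(items) == 0) return(NA_character_)
  titles <- vapply(items, function(item) item$title, character(1))
  ids <- vapply(items, function(item) item$id, character(1))
  hit <- which(tolower(titles) == tolower(title))
  if (length(hit) == 0) NA_character_ else ids[hit[1]]
}

# Mock stand-ins for query results (plain lists, not real sbitem objects).
mock_items <- list(
  list(title = "nwis_01646305", id = "556f2978e4b0d9246a9fcfa1")
)

found <- find_id_by_title(mock_items, "NWIS_01646305")
missing_empty <- find_id_by_title(list(), "nwis_0723")
missing_nomatch <- find_id_by_title(mock_items, "nwis_0723")
```

Returning NA for both the empty list and the no-match case keeps the behavior in line with what by='tag' already does for non-existing items.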

aappling-usgs commented 8 years ago

an exhaustive search of the project for 'query' shows that get_sites, get_watershed_WFS, list_datasets, list_metab_models, list_metab_runs, list_metas, and post_watershed all call query_item_identifier and assume it returns a data.frame, so all of these will need to be revised if/when https://github.com/USGS-R/sbtools/issues/194 gets resolved. but locate_item is the only function that uses query_item_in_folder, so there's only one code chunk that needs fixing today.
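
if the other callers ever need the same treatment, one option would be a small normalizer that accepts either return format. this is only a sketch: `as_title_id_df` is a hypothetical name, and the list branch assumes each element has `$title` and `$id` fields.

```r
# Hypothetical normalizer: coerce a query result to a title/id data.frame.
# The data.frame check must come first, because a data.frame is also a list.
as_title_id_df <- function(result) {
  if (is.data.frame(result)) return(result)
  data.frame(
    title = vapply(result, function(item) item$title, character(1)),
    id = vapply(result, function(item) item$id, character(1)),
    stringsAsFactors = FALSE
  )
}

# Mock inputs covering the cases seen in this thread.
from_empty_list <- as_title_id_df(list())
from_list <- as_title_id_df(list(
  list(title = "nwis_01646305", id = "556f2978e4b0d9246a9fcfa1")
))
from_df <- as_title_id_df(data.frame(
  title = "nwis_01646305",
  id = "556f2978e4b0d9246a9fcfa1",
  stringsAsFactors = FALSE
))
```

callers that only check nrow() or index by $title and $id would then behave the same for either format. note that an empty list becomes a 0-row data.frame with two columns, while the empty result shown earlier has 0 columns.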

aappling-usgs commented 8 years ago

ok, pull request coming. all of the tests i ran earlier in this thread now come back as i'd hope:

> locate_meta('basic', by='tag')
[1] "559c1bc8e4b0b94a6401792e"
> locate_meta('basic', by='dir')
[1] "559c1bc8e4b0b94a6401792e"
> locate_meta('basic', by='either')
[1] "559c1bc8e4b0b94a6401792e"
> locate_meta('notameta', by='tag')
[1] NA
> locate_meta('notameta', by='either')
[1] NA
> locate_ts('doobs_nwis', 'nwis_02322688')
[1] "5581c5cfe4b023124e8f38a4"
> locate_ts('doobs_nwis', 'nwis_02322688', by='dir')
[1] "5581c5cfe4b023124e8f38a4"
> locate_ts('doobs_nwis', 'nwis_02322688', by='either')
[1] "5581c5cfe4b023124e8f38a4"
> locate_ts('par_nwis', 'nwis_02322688')
[1] NA
> locate_ts('par_nwis', 'nwis_02322688', by='dir')
[1] NA
> locate_ts('par_nwis', 'nwis_02322688', by='either')
[1] NA
> locate_site('nwis_07239450', by='either')
[1] "556f1f5ae4b0d9246a9fc695"
> locate_site('nwis_07239450', by='dir')
[1] "556f1f5ae4b0d9246a9fc695"
> locate_site('nwis_0723', by='dir')
[1] NA
> locate_site('nwis_0723', by='either')
[1] NA