ncss-tech / soilDB

soilDB: Simplified Access to National Cooperative Soil Survey Databases
http://ncss-tech.github.io/soilDB/
GNU General Public License v3.0

SCAN/SNOTEL metadata #61

Open dylanbeaudette opened 6 years ago

dylanbeaudette commented 6 years ago

Should be possible to snag it here:

[image not preserved]

Asking Deb Harms to be sure.

dylanbeaudette commented 6 years ago

Seems to work (04ffe72e28239414ec5b78e40841e2c603f64ae5). Still need to test, document, and then explain in the related tutorial:

http://ncss-tech.github.io/AQP/soilDB/fetchSCAN-demo.html

@brownag : figured this would be a nice addition to CDEC-based eval of climate data in our part of the world.

Example:

# bleeding-edge version
library(soilDB)
x <- fetchSCAN(site.code=c(462, 697, 574), year=2016)

# new metadata element in list
x$metadata
            Name Site      State Network County Elevation_ft Latitude Longitude          HUC
275 Ebbetts Pass  462 California  SNOTEL Alpine         8661 38.54970 -119.8047 160502010104
507 Leavitt Lake  574 California  SNOTEL   Mono         9604 38.27594 -119.6128 160503020104
741  Poison Flat  697 California  SNOTEL Alpine         7736 38.50576 -119.6262 160502010102

# all stations can be seen with
data(SCAN_SNOTEL_metadata)

# or
SCAN_site_metadata(site.code = c(462, 697, 574))

dylanbeaudette commented 6 years ago

Another reminder to link back to the great work in https://github.com/mrke/NicheMapR

http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12148/abstract

https://camelunimelb.wordpress.com/resources/

dylanbeaudette commented 6 years ago

Still waiting to hear back from Deb on the mapping between SCAN/SNOTEL site code and KSSL pedon ID. We may be able to reverse-engineer this linkage via spatial proximity.

jskovlin commented 6 years ago

I just looked into this: the SCAN/SNOTEL station ID should map back to the pedon ID via the site.climstaid field in the associated lab pedons. We will need to add that field to fetchNASIS/fetchKSSL. It is probably easier to track them all back and save the corresponding pedon IDs in the SCAN metadata file in the package for easy access. One thing I'm not sure about is whether the site.climstaid field has been consistently populated; my initial look indicates that it might not be. However, there are only just over 1,000 stations, and not all of them may have corresponding lab data.

dylanbeaudette commented 6 years ago

@jskovlin Excellent detective work! I knew that there was a linkage somewhere in one of our databases. Do you have any example sites in mind that I can use to test? What we need is a NASIS or LIMS style query that pulls down the latest site / pedon records that have a non-null site.climstaid field.

jskovlin commented 6 years ago

Unfortunately, it has not been consistently populated... you'll see what I mean when you pull all these data in. Sometimes the field is null and the SCAN/SNOTEL site is mentioned in the sitetext notes or, dare I say it, not at all.

FROM nasis_site, nasis_group, site, OUTER(site_observation, OUTER (pedon, OUTER (transect)))
WHERE site.user_site_id IMATCHES ?
AND nasis_site.nasis_site_name IMATCHES ?
AND nasis_group.group_name IMATCHES ?
AND site.climstaid IS NOT NULL
AND JOIN transect TO pedon
AND JOIN pedon TO site_observation
AND JOIN site_observation TO site
AND JOIN nasis_group TO site
AND JOIN site TO nasis_site

dylanbeaudette commented 6 years ago

Nice. Looks like there are a couple relevant fields then. For now, it might be best to filter on climstatype IN ('SCAN', 'SNOTEL').

We might need @smroecker's help on writing a clever LIMS report that we can periodically hit.
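A minimal base-R sketch of the linkage discussed above: filter site records on climstatype and a non-null climstaid, then left-join the user pedon ID onto the station metadata. The data frames are toy data; only the 574 / S10CA051002 pairing appears in this thread, the other records are hypothetical, and storing climstaid as the bare station number is an assumption about how the field is populated.

```r
# Toy station metadata (site codes from the fetchSCAN example in this thread)
stations <- data.frame(
  Site = c(462, 574, 697),
  Name = c("Ebbetts Pass", "Leavitt Lake", "Poison Flat"),
  stringsAsFactors = FALSE
)

# Hypothetical site/pedon records; only the 574 -> S10CA051002 link is from this thread
sites <- data.frame(
  upedonid    = c("S10CA051002", "HYPOTHETICAL01", "HYPOTHETICAL02"),
  climstaid   = c("574", NA, "9999"),   # assumed format: bare station number
  climstatype = c("SNOTEL", NA, "COOP"),
  stringsAsFactors = FALSE
)

# Keep records with a populated station ID of the relevant network type
keep <- !is.na(sites$climstaid) & sites$climstatype %in% c("SCAN", "SNOTEL")
linked <- sites[keep, c("climstaid", "upedonid")]
linked$Site <- as.numeric(linked$climstaid)

# Left join: every station is retained, unmatched stations get NA for upedonid
meta <- merge(stations, linked[, c("Site", "upedonid")], by = "Site", all.x = TRUE)
print(meta)
```

Stations without a populated climstaid simply come back with NA, which makes the gaps in the field easy to count before deciding how much manual cross-referencing is needed.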

dylanbeaudette commented 6 years ago

As of 5d9dcf22c7a7e33577be271db6f817b696a1c845, there are some more detailed metadata for SCAN/SNOTEL sites in the Western US. Thanks to Jay and Kent Sutcliffe!

library(soilDB)
x <- fetchSCAN(site.code=c(462, 697, 574), year=2016)

knitr::kable(x$metadata, row.names = FALSE)
Name          Site  State       Network  County  Elevation_ft  Latitude  Longitude  HUC           climstanm     upedonid     pedlabsampnum
Ebbetts Pass  462   California  SNOTEL   Alpine  8661          38.54970  -119.8047  160502010104  NA            NA           NA
Leavitt Lake  574   California  SNOTEL   Mono    9604          38.27594  -119.6128  160503020104  Leavitt Lake  S10CA051002  11N0377
Poison Flat   697   California  SNOTEL   Alpine  7736          38.50576  -119.6262  160502010102  NA            NA           NA

brownag commented 6 years ago

Excellent to see all the progress that has been made here.

Unfortunately, can't allocate much mental real estate to this at this time. Have a lot on my plate with CA630.

For the record, my intersectHorizon code is pretty basic and is set up to just return ph/chiid/labsampnum corresponding to one or more horizons occurring at a particular depth, or a range of depths. I designed it to be used internally within functions that would deal with manipulation of the actual horizon data.

A LEFT JOIN-like intersect would be a bit more involved, but would probably also be friendlier for general use. Linking a sensor depth (a single value) to the properties of the horizon at that depth is simple enough. But what about the case where a depth interval is specified? Say you wanted to estimate the thermal properties of all horizons between 0 and 50 cm. A true JOIN would duplicate the "site" record you are joining to whenever the interval intersects multiple horizons, unless you aggregate (e.g. with depth.wtd.average). Perhaps an option to aggregate would be a neat feature for intersection over an interval!
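To make the aggregation idea concrete, here is a rough base-R sketch of a depth-weighted mean over an interval, so that one value per pedon comes back instead of one row per intersected horizon. This is not the intersectHorizon code; the function name interval_wtd_mean, the column names, and the horizon data are all hypothetical.

```r
# Hypothetical horizons for one pedon: depths in cm, 'clay' is an illustrative property
hz <- data.frame(
  top    = c(0, 10, 30, 60),
  bottom = c(10, 30, 60, 100),
  clay   = c(10, 20, 30, 40)
)

# Depth-weighted mean of column `prop` over the interval [z1, z2];
# each horizon is weighted by the thickness of its overlap with the interval
interval_wtd_mean <- function(hz, prop, z1, z2) {
  overlap <- pmin(hz$bottom, z2) - pmax(hz$top, z1)
  overlap[overlap < 0] <- 0   # horizons entirely outside the interval
  sum(hz[[prop]] * overlap) / sum(overlap)
}

# 0-50 cm: (10*10 + 20*20 + 30*20) / 50 = 22
interval_wtd_mean(hz, "clay", 0, 50)
```

The sketch does not handle missing property values or a zero-thickness interval; a single sensor depth z would instead be matched with something like hz[hz$top <= z & z < hz$bottom, ].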

Just my 2c.

dylanbeaudette commented 6 years ago

Got some updated metadata from Steve Campbell (SCAN pedon IDs). Added a new function for attempting to cross-reference user pedon ID with lab ID via LIMS report.

https://nasis.sc.egov.usda.gov/NasisReportsWebSite/limsreport.aspx?report_name=Pedon+Description+html+(userpedid)&pedon_id=S11UT043001

See changes in related commit for details.
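For reference, a small sketch of how the report URL above can be assembled for an arbitrary user pedon ID. It only builds the URL string and does not fetch or parse the report; the helper name lims_report_url is hypothetical and is not the function added in the commit.

```r
# Base URL and report name copied from the link above
lims_base <- "https://nasis.sc.egov.usda.gov/NasisReportsWebSite/limsreport.aspx"
report_name <- "Pedon+Description+html+(userpedid)"

# Hypothetical helper: build the report URL for one user pedon ID
lims_report_url <- function(pedon_id) {
  paste0(
    lims_base,
    "?report_name=", report_name,
    "&pedon_id=", utils::URLencode(pedon_id, reserved = TRUE)
  )
}

u <- lims_report_url("S11UT043001")
print(u)
```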

Still need to verify merging of various metadata sources:

dylanbeaudette commented 6 years ago

Another source of data, c/o @mrke

https://github.com/mrke/NicheMapR/blob/master/data/SCANsites.RData

dylanbeaudette commented 3 years ago

SCAN / SNOTEL site metadata update.

dylanbeaudette commented 2 years ago

Updates, after starting a conversation with Melissa Webb (SCAN/SNOTEL).

Cross-reference IDs:

library(soilDB)

SDA_query("SELECT * FROM lab_pedon WHERE pedon_key = '18827' ;")
SCAN_site_metadata(site.code = 2001)

Note that the SCAN station is described by name in a site text note rather than in the "Climate Station ID" field (site.climstaid): [image not preserved]

Note that the MLRA-owned pedon contains the correct link back to LDM/KSSL, but is missing the "certified" checkbox: [image not preserved]

TODO:

brownag commented 2 years ago

In https://github.com/ncss-tech/soilDB/commit/44d3a2cd6dec996052b37d93b46fdee3c88597e1#diff-0ecbf43726e8786ff52fe0996cddbff049fae27b910e3ce6c9b7a0c0c125d4be I added pedlabsampnum to the following SCAN stations that were previously missing it:

Site upedonid pedlabsampnum
2092 04KS035001 04N0873
2067 D99PR055007 07N0374
2177 S10AL101001 10N1241
2094 04KS131100 04N0874
2213 S2014AK180002 15N0704
2189 S2012CA079003 13N97488
2107 S2004NM025001 04N0399
2190 S2012CA027001 13N97484
2187 S2012CA027002 13N97485
2191 S2012CA051002 13N97486
2192 S2012CA063001 13N97487
2185 S2011CA071003 13N97465
2173 S10AL033001 10N1238
2212 S2014AK290011 15N0705
2201 S2012TX275001 12N8036
2106 S2004TX079001 04N0398
2105 S2004TX219001 04N0397
2169 S09NM061001 10N0134
2178 S10AL087001 10N1240
2093 04KS147001 04N0875
2104 S2004TX303001 04N0396
2176 S10AL047001 10N1239
2186 S2011CA071004 13N97466
2179 S10AL133001 10N1243
2175 S10AL111001 10N1242
2108 S2004NM025002 04N0400

The following SCAN stations that are missing a related lab pedon have a KSSL pedon within 500 m of the station location (via fetchKSSL(bbox=...)). Not all of these are good matches, which might suggest that one or more of the SCAN site/pedon coordinates are inaccurate, or that the pedons accessible via fetchKSSL() are incomplete.

Site  pedon_id       pedon_key  pedlabsampnum  distance (m)
2216  1998NV031163   24200      99P0166        88.09832
2224  1992MO051002   48788      M9205102       474.31598
2124  91VI020012     18620      91P1200        440.72794
2218  71-CA-45-169x  52934      UCD7145169     127.95896
2147  S2006KS103600  33574      06N1026        470.99138
2228  74TX347001     4397       40A4643        146.18007
808   S07MT031001    34591      08N0151        20.02456
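
The proximity screen can be sketched in base R as follows. The thread does not say how the distances above were computed; this version uses a haversine great-circle distance as a stand-in, with a 500 m cutoff. The station coordinates are those reported for Leavitt Lake earlier in the thread, while the pedon records and the helper name haversine_m are hypothetical.

```r
# Great-circle distance in metres (haversine formula, spherical Earth, r = 6,371 km)
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Station coordinates as listed in x$metadata earlier in this thread
station <- data.frame(Site = 574, Latitude = 38.27594, Longitude = -119.6128)

# Hypothetical pedon locations, for illustration only
pedons <- data.frame(
  pedon_id  = c("HYPOTHETICAL_NEAR", "HYPOTHETICAL_FAR"),
  Latitude  = c(38.27694, 38.30000),
  Longitude = c(-119.6128, -119.6128),
  stringsAsFactors = FALSE
)

pedons$distance_m <- haversine_m(
  station$Latitude, station$Longitude,
  pedons$Latitude, pedons$Longitude
)

# Keep candidates within 500 m, nearest first
candidates <- pedons[pedons$distance_m <= 500, ]
candidates <- candidates[order(candidates$distance_m), ]
print(candidates)
```

A distance cutoff only produces candidates; as noted above, each match still needs to be checked against the pedon and station descriptions.
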
dylanbeaudette commented 2 years ago

Good work. I've got some additional metadata from the SCAN folks, will drop it into the folder in /misc/ where we have been working. There will be several levels of cross-referencing that will have to happen. Will try to add the additional kinks described by Kent Sutcliffe soon.