I am hoping you can help me be a better collaborator and better user of the temperature data. If I need to pull surface temperatures for a given DOW/date (such as the attached list), what is the best way for me to do that? Do you mind sharing your code for doing this? My current protocol would be to download/save the whole data release and then extract what I needed, but that doesn't seem like the most efficient way.
my response in email so far:
Unfortunately, I don’t think there is a simple way for you to do this right now, and that is an oversight in the way the data release is set up.
Everything is keyed off of the lake identifier (site_id in the release) and the group_id, which tells you which file(s) you need to download when files in the ScienceBase release are large and therefore split across several zip files (e.g., the raw daily temperature predictions). That information, as well as the lake name (if it appears in NHD’s GNIS_name), is in lake_metadata.csv.
But the problem our setup doesn’t solve is the connection to the state identifiers. You’d need to know which NHDID (site_id) you want for each MNDOW, then use that to figure out which zip groups to download and unzip, and which .csv within each to read. It is probably frustrating to hear that we do have those crosswalks in our pipeline for processing the data release, but they aren’t available in the release itself.
I think we need to do two things to support your workflow: 1) add a crosswalk table to the data release, and 2) provide some example code showing how to use it to download and access the data you need for MNDOWs.
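As a rough sketch of what that example code could look like (not our pipeline code): it assumes a crosswalk table with columns named MNDOW_ID and site_id, which are hypothetical names since the crosswalk isn't in the release yet, and it assumes one site_id per DOW. The site_id and group_id columns are the ones in lake_metadata.csv. The IDs below are made-up placeholders, and the function only works out which zip groups you'd need and which site_ids to look for in each; the download and unzip steps are left out.

```python
import csv
import io
from collections import defaultdict


def plan_downloads(dow_ids, crosswalk_rows, metadata_rows):
    """Group the site_ids for the requested DOWs by the group_id that holds them.

    Returns (groups, unmatched): groups maps group_id -> list of site_ids,
    unmatched lists DOWs with no crosswalk entry or no metadata row.
    """
    # DOW -> site_id, from the crosswalk table (column names are assumed)
    dow_to_site = {row["MNDOW_ID"]: row["site_id"] for row in crosswalk_rows}
    # site_id -> group_id, from lake_metadata.csv
    site_to_group = {row["site_id"]: row["group_id"] for row in metadata_rows}

    groups = defaultdict(list)
    unmatched = []
    for dow in dow_ids:
        site = dow_to_site.get(dow)
        group = site_to_group.get(site)
        if site is None or group is None:
            unmatched.append(dow)
        else:
            groups[group].append(site)
    return dict(groups), unmatched


# Toy stand-ins for the crosswalk and lake_metadata.csv (placeholder values only).
crosswalk_csv = "MNDOW_ID,site_id\nDOW_A,nhdhr_001\nDOW_B,nhdhr_002\n"
metadata_csv = "site_id,group_id\nnhdhr_001,group_01\nnhdhr_002,group_02\n"

crosswalk = list(csv.DictReader(io.StringIO(crosswalk_csv)))
metadata = list(csv.DictReader(io.StringIO(metadata_csv)))

groups, unmatched = plan_downloads(["DOW_A", "DOW_B", "DOW_C"], crosswalk, metadata)
print(groups)     # zip groups to download, with the site_ids to read from each
print(unmatched)  # DOWs we could not match to a lake in the release
```

With the real files you'd swap the StringIO pieces for open("lake_metadata.csv") and the crosswalk file, then download only the zip files for the group_ids that come back.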
from GH:
And here are all of the crosswalks we have currently: