Closed: aaraney closed this issue 4 years ago
@aaraney The reason I am using NHDPlus is that I was planning to implement some functions similar to nhdplustools for getting the river network. But I found that to be time-consuming, so I decided, for now, to use the R package to get the river network and geometry. Navigating NHDPlus requires a ComID, and the only easy and straightforward method I could find to relate a USGS station ID to its corresponding ComID was to use the NHDPlus database. The two databases the code uses to get the station metadata are small, and the code downloads them automatically, so strictly speaking they're not shipped.
There are some other ways of getting a watershed's geometry and characteristics, for example the StreamStats service, but for the river network data we still need NHDPlus.
If you're interested, you could work on implementing StreamStats support. There's already a Python package for this purpose, called streamstats, that we can use. It returns the geometry as GeoJSON, so we'd have to alter the LULC function that uses the geometry for clipping the raster data.
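Since StreamStats hands back GeoJSON rather than a shapefile, the LULC clipping step would need to accept a geometry dict. A minimal sketch of pulling the exterior ring out of a GeoJSON feature; the sample feature below is a dummy for illustration, and the real StreamStats response may wrap the feature in a service-specific envelope:

```python
def polygon_coords(feature):
    """Return the exterior ring of a GeoJSON Polygon or MultiPolygon feature."""
    geom = feature["geometry"]
    if geom["type"] == "Polygon":
        return geom["coordinates"][0]
    if geom["type"] == "MultiPolygon":
        # First polygon's exterior ring
        return geom["coordinates"][0][0]
    raise ValueError(f"unsupported geometry type: {geom['type']}")

# Dummy watershed feature for illustration only (not real StreamStats output)
sample = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]],
    },
    "properties": {},
}
ring = polygon_coords(sample)
```

The extracted ring could then be fed to whatever masking routine the LULC function uses to clip the raster.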
@cheginit That makes sense, thanks for the explanation. I am approaching it from the end-user case: I'm curious whether we can source the NHDPlus files from within the site-packages install dir so the user doesn't have to know about the location of that file. It would also be interesting to see if we could throw the NHDPlus contents that are necessary for ComID navigation into a pandas DataFrame or some similar format (numpy array, etc.), save it as a pickle file, and just ship the pickle file. I think we could cut two dependencies from the project by doing that. Although personally, if we can find a solution that does not involve shipping the NHDPlus network to the end user, that would be the best case in my opinion.
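To make the ship-a-pickle idea concrete, here's a minimal stdlib sketch. The comment suggests pandas, but a plain dict pickles the same way; the station/ComID pairs below are dummy values, not real NHDPlus entries:

```python
import pickle

def dump_lookup(table):
    """Serialize a station-ID -> ComID lookup to bytes (the artifact we'd ship)."""
    return pickle.dumps(table, protocol=pickle.HIGHEST_PROTOCOL)

def load_lookup(blob):
    """Restore the lookup; in the package this would read from site-packages."""
    return pickle.loads(blob)

# Dummy IDs for illustration only; the real table would be built from NHDPlus.
table = {"00000001": 100001, "00000002": 100002}
restored = load_lookup(dump_lookup(table))
```

A DataFrame version would just swap `pickle.dumps`/`loads` for `DataFrame.to_pickle`/`read_pickle`, at the cost of keeping pandas as a dependency.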
Along the same lines, if you are just tracing upstream from a gauge station: I know Dave Blodgett shared with me at some point an endpoint of a USGS API that lets you give it a USGS gauge ID and it traces upstream and returns, I believe, all of the ComIDs. I will find it and share it back in this thread. That could be another potential solution; however, I need to do a more thorough code review of the NHDPlus networking we are doing.
Streamstats seems promising, thanks for including it. If we were to include it, what functionality would you like to see come from that package?
I agree. The NHDPlus should be a separate endeavor! So I refactored the code and removed all NHDPlus dependencies and used StreamStats (SS). I think we can rely on HUC8 to relate the data to NHDPlus.
One caveat with SS is that the watershed geometry is not exactly the same as NHDPlus; there are some differences. I compared the results between SS and NHDPlus (using the R script in the repo), and the NHDPlus geometry seems to be a little bigger and higher quality (more detailed). Maybe Dave can help us understand the reason, and maybe with porting some of his tool's (nhdplustools) functionality to Python.
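One quick way to quantify the "a little bigger" observation is to compare the two delineations' ring areas directly. A small sketch using the shoelace formula; the coordinates below are toy rings, not real watershed boundaries:

```python
def ring_area(ring):
    """Unsigned shoelace area of a closed ring of (x, y) pairs."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def area_ratio(ring_a, ring_b):
    """How much smaller (or larger) one delineation is relative to another."""
    return ring_area(ring_a) / ring_area(ring_b)

# Toy example: a 1x1 square watershed vs. a 2x2 one
small = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
big = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
```

For projected coordinates this gives areas in map units; for lon/lat rings you'd want an equal-area projection first, or shapely's `symmetric_difference` to see exactly where the two boundaries disagree.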
I implemented the NLDI, so there's no need to deal with StreamStats or a direct download of NHDPlus. I'll close this for now; feel free to reopen it if you think more needs to be done related to this issue.
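For reference, the NLDI calls are plain keyless HTTP GETs. A hedged sketch of the two URL patterns involved (station-to-ComID lookup, and upstream navigation); the base URL and path layout here are my recollection of the NLDI API and should be double-checked against the service docs:

```python
# Assumed NLDI base URL; verify against current USGS documentation.
NLDI = "https://labs.waterdata.usgs.gov/api/nldi/linked-data"

def site_url(station_id):
    """Lookup for a NWIS gauge; the returned feature's properties carry the ComID."""
    return f"{NLDI}/nwissite/USGS-{station_id}"

def upstream_url(station_id, mode="UT", distance_km=9999):
    """Upstream navigation: 'UT' = upstream with tributaries, 'UM' = main stem only."""
    return (f"{NLDI}/nwissite/USGS-{station_id}"
            f"/navigation/{mode}/flowlines?distance={distance_km}")
```

Fetching `upstream_url(...)` with any HTTP client returns the traced flowlines as GeoJSON, which covers the upstream-trace use case discussed earlier in this thread.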
Before I offer my suggestion, I may be missing the utility of shipping the NHD with the repo. With that in mind, and if you don't mind elaborating later, what are your thoughts on moving away from shipping the NHDPlus dataset to users and instead relying on the USGS's API to verify and obtain gauge metadata? It should be a straightforward call that doesn't require an API key.
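A sketch of what that metadata call could look like against the NWIS site service, with a tiny stdlib parser for the tab-delimited RDB format it returns. The parameter names follow the public waterservices docs as I remember them, and the sample response text below is fabricated for illustration:

```python
from urllib.parse import urlencode

NWIS_SITE = "https://waterservices.usgs.gov/nwis/site/"

def site_info_url(station_id):
    """Build the metadata query; 'format=rdb' asks for USGS tab-delimited text."""
    return f"{NWIS_SITE}?{urlencode({'format': 'rdb', 'sites': station_id})}"

def parse_rdb(text):
    """Parse RDB text: '#' lines are comments, then a header row,
    then a column-format row, then the data rows."""
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    header = lines[0].split("\t")
    return [dict(zip(header, ln.split("\t"))) for ln in lines[2:]]

# Fabricated sample of the RDB layout (not a real NWIS response)
sample = ("# header comment\n"
          "site_no\tstation_nm\n"
          "8s\t50s\n"
          "00000000\tExample Station\n")
rows = parse_rdb(sample)
```

An empty `rows` list for a requested site would signal an invalid gauge ID, which covers the verification half of the suggestion.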