WikiWatershed / rapid-watershed-delineation

Rapid Watershed Delineation Code for MMW2
Apache License 2.0

Update NHD data #46

Closed: kdeloach closed 7 years ago

kdeloach commented 7 years ago

The data processing for the Continental US (lower 48 states) is complete and ready to deploy. It is 500 GB in 84259 subwatershed folders, posted as 86 zip files (85 subwatershed zip files and one top-level zip file) in https://drive.google.com/drive/folders/0B7V8il12WGQJM1JFWkN6bXpOX1k. (Sorry, I forgot to change the MS prefix before zipping, so do not be misled: the files are for the whole US, not only the Mississippi region.)

Ref: https://github.com/WikiWatershed/rapid-watershed-delineation/pull/44#issuecomment-264650874

mmcfarland commented 7 years ago

We may be able to reduce the overall footprint of the data by recursively walking the directories and enabling compression on all of the tifs. I believe the top-level tif is already compressed, but the rest didn't seem to be for the region 2 data.
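The recursive walk suggested above could look roughly like the following. This is a minimal sketch, not the project's actual tooling: it assumes GDAL's `gdal_translate` command-line tool is installed, that DEFLATE is an acceptable compression method, and that rewriting each file in place through a temporary file is safe. The function names `find_tifs`, `compress_command`, and `compress_tree` are hypothetical.

```python
import os
import subprocess


def find_tifs(root):
    """Yield the path of every .tif/.tiff file under root, recursively."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith((".tif", ".tiff")):
                yield os.path.join(dirpath, name)


def compress_command(src, dst, method="DEFLATE"):
    """Build a gdal_translate call that rewrites src as a compressed GeoTIFF."""
    return ["gdal_translate", "-of", "GTiff", "-co", "COMPRESS=" + method, src, dst]


def compress_tree(root, method="DEFLATE", dry_run=True):
    """Compress every tif under root; return the commands that were built.

    With dry_run=True (the default) nothing is executed, so the commands
    can be reviewed first. Otherwise each file is written to a temporary
    path and then swapped into place.
    """
    commands = []
    # Materialize the list first so temporary files are never picked up.
    for src in list(find_tifs(root)):
        tmp = src + ".tmp"
        cmd = compress_command(src, tmp, method)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)  # assumes gdal_translate is on PATH
            os.replace(tmp, src)
    return commands
```

Calling `compress_tree("/path/to/data")` first and inspecting the returned commands, then rerunning with `dry_run=False`, keeps the destructive step explicit. Files that are already compressed would simply be rewritten with the chosen method.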

ajrobbins commented 7 years ago

@dtarb has made a change in the data pre-processing to account for "holes" in the watersheds. This has resulted in new data, which is in a file called "Simple.zip" in the Google Drive folder referenced above.

Instructions from David:

To use this, you will need to unzip this file while retaining its folder structure, so that each file lands in the corresponding folder of the data you already received. Then there is one line of code to change in RWD, which changes the name of the files being used. I already pushed this change to the branch I created on GitHub.
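The unzip step described above can be sketched in Python as follows. This assumes the paths inside the archive are relative to the same root as the existing data folders, which the thread does not confirm; the function name `merge_zip_into` and its arguments are hypothetical.

```python
import os
import zipfile


def merge_zip_into(zip_path, data_root):
    """Extract zip_path into data_root, keeping the archive's folder structure.

    Existing folders are reused, files already present are left alone unless
    the archive contains the same relative path, in which case they are
    overwritten. Returns the relative paths of the files that were written.
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_root)
        return [name for name in archive.namelist() if not name.endswith("/")]
```

For example, `merge_zip_into("Simple.zip", "/path/to/data")` would place each new file next to the existing files of its subwatershed folder, provided the assumption about the archive layout holds.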

kdeloach commented 7 years ago

The combined file size for this new set of data is about 575 GB. This is based on the sum of the file size column of the NHDPlus/RWDContUSA folder in GDrive.

kdeloach commented 7 years ago

I have completed the download of NHD data and have begun verifying it against the latest RWD code changes.

I'm also downloading the NHD data to an EC2 instance. It will be quicker to process the data directly on AWS than to upload the processed files from our local network.

kdeloach commented 7 years ago

I downloaded the NHD files to my workstation and to an EC2 instance. I'm still working on merging this data with the DRB files and verifying that everything works with the latest code changes.

kdeloach commented 7 years ago

Splitting the remaining work up into separate issues.