ChristinaB opened 4 years ago
The Notebook view shows the imshow plot of the DHSVM ASCII model output for the first two years.
This is the view with the array mapped to UTM coordinates.
It's not flipped! This is the Landlab input for the SCL domain extent, 30 m grid.
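For reference, a minimal sketch of that orientation check (the array and corner names are placeholders, not the notebook's actual variables):

import matplotlib.pyplot as plt

# imshow's origin keyword controls whether row 0 plots at the top or bottom;
# ESRI ASCII grids store row 0 as the northern edge, so the default
# origin='upper' gives a north-up view, and extent maps pixels to UTM corners.
plt.imshow(dwt_array, extent=(xmin, xmax, ymin, ymax))
plt.colorbar(label='depth to water table (m)')
plt.show()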
Known issues:
[ ] The dates read from the DHSVM output are saved as strings, so when they are added to the NetCDF they are unusable for time indexing and mapping. This will need to be fixed to do most visualization with the coolest, easiest tools (see the sketch just after this list).
[ ] Ronda and Nicoleta both came up with UTM coordinates and I cannot decipher how. @NCristea can you describe how/why/where the grid pickle was created? If we don't have that, or there is an issue with it, the entire workflow is busted. Ronda - where did you get your numbers? I have them copied to the Notebook to make it easy to compare with Nicoleta's numbers - currently they are all different. If you go down to Section 4 (Reshape 150m grid to 30m) you can use the high-res grid and we can be sure it is correct.
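For the date-string item, a minimal sketch of one fix (assuming the DHSVM strings look like '01/29/2099-21', the format used in an example later in this thread; date_strings and ds are placeholder names):

import pandas as pd

# Parse the DHSVM date strings into real datetimes before writing the NetCDF,
# so the time coordinate is usable for indexing, resampling, and plotting.
time_index = pd.to_datetime(date_strings, format='%m/%d/%Y-%H')
ds = ds.assign_coords(time=time_index)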
This code does work from the xarray pre-netcdf export. The indexing does not work from the netcdf imported with the code we were trying to use earlier. However, this takes too long to run and we may need to recode it (see the sketch below the loop).
import itertools
import numpy as np

# Build a per-node dictionary holding the full time series for each grid cell.
counter = 0
for j in range(len(x)):
    for k in range(len(y)):
        # Pull the single-cell slice and flatten the nested (variable, x, y, time) levels
        one_location = dsi.isel(x=[j], y=[k]).to_array()
        loc1list = np.array(one_location.variable)
        b = list(itertools.chain(*loc1list))
        c = list(itertools.chain(*b))
        d = list(itertools.chain(*c))
        HSD_dict_annualmaxDWT_hist[counter] = {keys[counter]: d}
        counter = counter + 1
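A possible speedup, as an untested sketch: assuming dsi has dims (time, x, y), stacking the spatial dims once avoids the per-cell isel() calls while keeping the same x-outer/y-inner cell order as the loop above.

arr = dsi.to_array().stack(cell=('x', 'y'))              # dims: (variable, time, cell)
vals = arr.transpose('cell', 'variable', 'time').values  # one array read, no per-cell isel
for counter in range(vals.shape[0]):
    # ravel() flattens (variable, time) the same way the chained lists did
    HSD_dict_annualmaxDWT_hist[counter] = {keys[counter]: list(vals[counter].ravel())}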
@NCristea Could you please put the future netcdf files in a folder here? https://www.hydroshare.org/resource/767e35f896a94023b25f788701bad641/ Thank you for processing those. I will test them and hopefully it is smooth sailing from here.
On visualization: do you have xrviz working on your desktop? Does it allow recording? Then we don't need to code an animation; we can just record a video of the screen.
The numbers for the corners (lon, lat) I got from looking closely at the GIS corners for the input rasters that I converted to ASCII (starting from the Phi raster). The node id corners were computed knowing the 30 m grid rows and columns.
The files are >2 GB in size; HydroShare takes files <1 GB.
Well, that would be a good reason! Ugh, then we'll have to come up with another approach. We're trying a different compression.
My experiment:
The future files are 3.5 G uncompressed.
Download from Google Drive manually
(Unzip the Windows .zip file manually)
From terminal:
gzip dtw_G_CNRM_CM5__rcp45.nc
Result is 1.6 G
tar -cvf dtw_G_CNRM_CM5__rcp45.nc.gz.tar dtw_G_CNRM_CM5__rcp45.nc.gz
Result is still 1.6 G
New plan is to look at Ronda's idea to address significant digits. We only need 3. This will make a big difference.
@NCristea Can we coordinate on running your code to manage dtype?
http://xarray.pydata.org/en/stable/io.html
These parameters can be fruitfully combined to compress discretized data on disk. For example, to save the variable foo with a precision of 0.1 in 16-bit integers while converting NaN to -9999, we would use
encoding={'foo': {'dtype': 'int16', 'scale_factor': 0.1, '_FillValue': -9999}}.
Compression and decompression with such discretization is extremely fast.
From http://xarray.pydata.org/en/stable/generated/xarray.Dataset.to_netcdf.html:
encoding (dict, optional): Nested dictionary with variable names as keys and dictionaries of variable-specific encodings as values, e.g.,
{'my_variable': {'dtype': 'int16', 'scale_factor': 0.1, 'zlib': True}, ...}
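Putting the two docs snippets together, a minimal sketch (the variable name 'dtw' and the output filename are placeholders, not the actual names in our files):

# Pack the float64 depth-to-water-table field into int16 on disk.
# Note: int16 with scale_factor=0.001 only spans about +/-32.7, so a larger
# dtype (or a coarser scale_factor) may be needed for 3-decimal precision.
encoding = {'dtw': {'dtype': 'int16', 'scale_factor': 0.1,
                    '_FillValue': -9999, 'zlib': True}}
dsi.to_netcdf('dtw_packed.nc', encoding=encoding)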
To solve the problem sooner: values.dtype is float64 when read in from the Map file. I'll see about limiting the numpy arrays to 3 significant digits only.
This looks like a way to limit values to 3-decimal-place precision:
value= str(round(value, 3))
@NCristea - could you update your code so the first array that is read in from the DHSVM Map file is limited to 3 significant digits? The approach below is slow, but it works.
# Round every element of the DHSVM array to 3 decimal places.
values_sig3 = values.copy()  # copy first so the original array is not modified in place
for i in range(values.shape[0]):
    for j in range(values.shape[1]):
        values_sig3[i, j] = round(values[i, j], 3)
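A much faster equivalent is a single vectorized numpy call (same result, no Python-level loop):

values_sig3 = np.round(values, 3)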
I don't see the future processing code on https://github.com/NCristea/SkagitLandslideHazards/ so I'm not sure what updates you have made since process_wt_grids_with_vis.py 3 days ago.
I am uploading one future file to cuahsi.jupyter.org, where I am running my version of the code, so I can test your future files. Do you expect them to be flipped? I don't know if there are other changes needed until we can use the files. 25% upload..... completing..... eventually
@RondaStrauch I am still trying to execute SCL_pickly_hydro.py to save the wt outputs. Note: at 30 minutes each, this will be a chore not to repeat. I have restarted the server a few times but my script gets 'killed'. I may try another server... otherwise, if the smaller digits make a difference, that would impact this step as well.
@RondaStrauch @NCristea
The future data is not flipped. The remaining issue to address when building the netcdf is converting the date string to a Python-readable date. Then I think we can do more with xarray to solve the dictionary size issue.
I changed my script to start with rounded arrays, and the last time I tried, it still broke this server. I will ask Tony if we can get more compute access. It doesn't seem like rounding made the files much smaller.
If we can't access the future data, then let's increase the historical by a fraction.
At Newhalem, annual precipitation is projected to increase by 6% by the 2050s. We could use that as a first cut.
Hi Christina,
You can convert the strings to dates using this as an example:
from datetime import datetime

date_string = '01/29/2099-21'
date_object = datetime.strptime(date_string, '%m/%d/%Y-%H')
print(date_object)
You can loop through the record with:
dates_str = dsi.time.values
record = len(dates_str)
dates_date = []
for i in range(record):
    date_object = datetime.strptime(dates_str[i], '%m/%d/%Y-%H')
    dates_date.append(date_object)
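A possible follow-up (a sketch, assuming dsi is the dataset above): assign the parsed dates back so xarray treats time as a real datetime coordinate, which enables time-based selection and plotting.

dsi = dsi.assign_coords(time=dates_date)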
I have uploaded a gif with an example animation of 10 images on Google Drive. The size is small; I think it could be used in the ePoster.
Nicoleta
I tried this, but it needs to be integrated into the entire workflow or it breaks the rest of the workflow.
@RondaStrauch
For easy downloading - files on HydroShare: the netcdf input with synthetic grid (uniform distribution, landslide component 1202)
For viewing the completed run Notebook - the same file is on Github: Uniform Synthetic
For easy downloading - files on HydroShare: the SCL model grid (uniform distribution, landslide component 1202)
For viewing the completed run Notebook - the same file is on Github: Uniform SCL grid
This is the lognormal distribution, synthetic grid working example with the 1202 component.
For viewing the completed run Notebook - the same file is on Github: Lognormal Synthetic
For downloading - the lognormal distribution, SCL (may be working, needs more time, 1202 component)
For viewing the completed run Notebook - the same file is on Github: lognormal distribution, SCL (may be working, needs more time, 1202 component)
To upload:
For viewing the completed run Notebook - this file is on Github: data driven and array saving distribution, SCL 1202 component
For downloading - from HydroShare: data driven, SCL 1202 component
@RondaStrauch Could you take a look at the error in the SCL uniform notebook? This is all updated to run with fire and the netcdf input, but I think this error is related to the T input. Do you agree?
@RondaStrauch For the lognormal spatial notebook - I think it runs fine, but it takes a long time and I had to interrupt the testing. Could you restart it on your end?
Note on the run 3 comments up: couldn't run the synthetic uniform because the synthetic.nc file is missing.
Added this to landslide_probability_20191202.py at line 729: if self.groundwater__recharge_distribution is not None: Then I ran the SCL uniform and lognormal. The connection fails before completing.
@RondaStrauch Did you update the 1202 file or do you want me to do that? You could put the most recent date on Github.
@ChristinaB the current .py is _20191206, so that is up on the hydroshare resource and needs to be the current one in the first code block of any notebook. I'll copy it over to github.
@ChristinaB - Oops, looks like the latest is _20191208, not 06!
@ChristinaB - can you load the mean and standard deviation text files with names like below into the ASCII folder in the Slippery Future Data resource?
"dtw_mean_hist.txt"
"dtw_stndev_hist.txt"
"dtw_mean_fut.txt"
"dtw_stndev_fut.txt"
These can be read in as arrays like this:
dtw_mean_h = np.loadtxt("dtw_mean_hist.txt")
dtw_stndev_h = np.loadtxt("dtw_stndev_hist.txt")
and then passed in the lognormal-spatial call to LandslideProbability() (see the sketch below).
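A rough sketch of that call, assuming the modified component keeps Landlab's lognormal_spatial keyword names; check landslide_probability_20191208.py for the exact parameter names before using:

import numpy as np
from landslide_probability_20191208 import LandslideProbability  # local modified copy

dtw_mean_h = np.loadtxt("dtw_mean_hist.txt")
dtw_stndev_h = np.loadtxt("dtw_stndev_hist.txt")
# Keyword names below mirror Landlab's lognormal_spatial recharge option and are
# an assumption for the depth-to-water-table version; adjust to match the .py file.
ls_prob = LandslideProbability(
    grid,
    number_of_iterations=250,
    groundwater__recharge_distribution='lognormal_spatial',
    groundwater__recharge_mean=dtw_mean_h,
    groundwater__recharge_standard_deviation=dtw_stndev_h,
)
ls_prob.calculate_landslide_probability()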
@ChristinaB - I've also added a mask for the nodes around the Goodell Fire so that we can limit the core nodes for calculation, and maybe for plotting, to that area.
# Read the fire-area mask and close all nodes outside it.
(grid1, fire_mask) = read_esri_ascii(data_folder + '/scl_firebox.txt')
grid.add_field('node', 'fire_area', fire_mask)
grid.set_nodata_nodes_to_closed(grid.at_node['fire_area'], -9999)
@ChristinaB - Added the fire area mask to the lognormal spatial notebook and added a placeholder for loading the dtw mean and standard deviation arrays that are in the ASCII folder.
SCL_lognormal_spatial_landslide_20191209.ipynb
By using the fire area mask to close nodes outside the Goodell Creek fire, we should be able to cut down the processing time. Then we can show figures zoomed in on the fire to see changes in landslide probability.
To dos:
1) generate dtw mean and stndev for historic and at least one future
2) add these as text files to the ASCII folder
3) check the notebook for correct naming
4) run the notebook 2 times (historical and future)
5) save figures - we want to make pretty pictures for the poster!
@RondaStrauch Try running 20200331_map2netcdf2array_lognormal_spatial_Depth_SCL_LandlabLandslide.ipynb from https://hydroshare.org/resource/4cac25933f6448409cab97b293129b4f, or click the link below to view it: https://hydroshare.org/resource/4cac25933f6448409cab97b293129b4f/data/contents/20200331_map2netcdf2array_lognormal_spatial_Depth_SCL_LandlabLandslide.ipynb
Updates: