Open BahramKhazaei-NOAA opened 1 year ago
@BahramKhazaei-NOAA The dem file has "nan" values, which may cause this error because pyDEM takes "-9999" as nodata. The gdal command to change nodata value cannot process this file:
gdal_calc.py -A dem_with_ocean.tif --calc="numpy.where(A<-100000, -9999, A)" --outfile dem_with_ocen_nodata.tif --NoDataValue=-9999
Do you have any method to change 'nan' to -9999? @BahramKhazaei-NOAA @SorooshMani-NOAA
@BahramKhazaei-NOAA @SorooshMani-NOAA Find a method to do it: https://gis.stackexchange.com/questions/163685/reclassify-a-raster-value-to-9999-and-set-it-to-the-nodata-value-using-python-a/163705#163705
@BahramKhazaei-NOAA @SorooshMani-NOAA Find a method to do it: https://gis.stackexchange.com/questions/163685/reclassify-a-raster-value-to-9999-and-set-it-to-the-nodata-value-using-python-a/163705#163705
Thank you. I was trying with QGIS, but it crashed. Did that work for you?
@BahramKhazaei-NOAA Yes, just use the last rule:
# Set large negative values to -9999
#Z[Z<-9999]= -9999
# Choose your preference: (comment either rule)
#Z[Z==-9999]= np.nan
# Or
Z[np.isnan(Z)]= -9999
I'll try this out and let you know if we still see issues. Thanks @cuill
I'll try this out and let you know if we still see issues. Thanks @cuill
Thanks, Soroosh.
@cuill using GMRT DEM, I read the raster values and set all nan's to -9999, but I still get the same error as before:
Traceback (most recent call last):
File "/lustre/meshing/river/extract/pyDEM_Samples/Bahram/run_serial.py", line 38, in <module>
gdf = S.write_shapefile(npt_smooth=3)
File "/home/Soroosh.Mani/sandbox/RiverMeshTools/pyDEM/pyDEM/dem.py", line 2402, in write_shapefile
idxs=np.squeeze(np.argwhere(np.isnan(river[:,0])))
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed```
This is my raster now:
In [5]: with rio.open('./fixed_gmrt.tif', 'r') as ds:
...: v = ds.read(1)
...:
In [6]: v
Out[6]:
array([[ -31.025713, -31.001385, -30.999046, ..., 1909.2639 ,
1822.1892 , 1735.6036 ],
[ -30.996735, -31.061066, -31.002304, ..., 1906.1929 ,
1839.0338 , 1778.5045 ],
[ -30.894072, -30.888163, -31.011707, ..., 1883.3633 ,
1867.0518 , 1846.1737 ],
...,
[-5467.1333 , -5472.786 , -5457.6455 , ..., -3490.741 ,
-3495.3052 , -3498.335 ],
[-9999. , -9999. , -9999. , ..., -9999. ,
-9999. , -9999. ],
[-9999. , -9999. , -9999. , ..., -9999. ,
-9999. , -9999. ]], dtype=float32)
Thank you Soroosh for testing again. Looks like river[:,0]
array is defined as a 2D array in dem.py
while a 1D array is passed to the function.
@cuill, any thoughts on this error message?
@BahramKhazaei-NOAA I just came back from vacation. I'll take a look at this error when I get a chance.
@BahramKhazaei-NOAA I just came back from vacation. I'll take a look at this error when I get a chance.
Appreciate your help and quick responses as always. Looks like for this error something is not compatible wit the definition of the river
array.
@BahramKhazaei-NOAA @SorooshMani-NOAA I cannot reproduce this error but I got another error related to the package GEOS (see issue here https://github.com/shapely/shapely/issues/1345) if using shapely2.0. So, please upgrade GEOS to 3.12.0.
I've pushed the version that worked on my side, please pull it. Here are the thalwegs extracted with a threshold of 1e4.
@cuill thanks for your feedback, so you just used the original DEMs and set all the nans to -9999 and ran pyDEM?
@SorooshMani-NOAA correct.
Hi @cuill I was able to install pyDEM
and run the both serial and parellel test cases provided here. However, when I try to run it based on my DEM it gives me the same error as Soroosh. Here is the error:
Traceback (most recent call last):
File "/work2/noaa/nos-surge/bkhazaei/work/run_dir/Alaska_Modeling/pre_processing/grid/bearing_sea_ocsmesh/pyDEM/run_serial.py", line 47, in <module>
gdf = S.write_shapefile(npt_smooth=3)
File "/work2/noaa/nos-surge/bkhazaei/envs/miniconda3/envs/RiverMesh3_10/lib/python3.10/site-packages/pyDEM/dem.py", line 2400, in write_shapefile
idxs=np.squeeze(np.argwhere(np.isnan(river[:,0])))
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
I made sure that the nan values in my DEM are replaced with -9999 as we discussed above. I'm using run_serial.py
that is provided for the savannah river test case and didn't change anything in that script except for the path to the DEM file. I'm pasting the script here:
#!/usr/bin/env python3
import sys
import glob
import time
from pylib import *
from pyDEM.dem import *
if __name__ == '__main__':
#input tiff files
PathToInputDEMs = '/work2/noaa/nos-surge/bkhazaei/work/run_dir/alaska_domain/mesh/ocsmesh/cook_inlet/DEMs_inputs'
PathToDomainBoundary = PathToInputDEMs+"/GMRTv4_1_1_20230815topo_NaN_removed.tif"
print('Path to DEM input:', str(PathToInputDEMs))
files = glob.glob(PathToDomainBoundary)
files.sort()
t0 = time.time()
for fname in files:
names = [fname]
#output filename
sname = f"./{fname.split('.')[0].split('/')[-1]}_1e7"
print(sname)
#declaer a dem object
S=dem()
#read data
if not os.path.exists('{}.npz'.format(sname)):
S.proc_demfile(names,sname=sname,depth_limit=[-100,1000],subdomain_size=2e10)
S.read_data('{}.npz'.format(sname))
#compuate watershed information
S.compute_watershed()
#extract river network (area_limit: catchment area)
S.compute_river(acc_limit=1e7)
#write shapefile for river network
gdf = S.write_shapefile(npt_smooth=3)
gdf.to_file(f'{sname}.shp')
print(f'It took {(time.time() - t0)/60} mins!')
Did you use the same script to get the result you shared above? If not, could you please share that with me?
@BahramKhazaei-NOAA Yes, I used the same script but with a smaller threshold (acc_limit=1e4). As you can see from the thalwegs above, the threshold of 1e4 has a relatively coarse thalwegs. I think your threshold of 1e7 results in zero thalwegs which causes the error. I will change the script to throw the error if this happens. At the time being, you can test with different threshold.
@BahramKhazaei-NOAA Yes, I used the same script but with a smaller threshold (acc_limit=1e4). As you can see from the thalwegs above, the threshold of 1e4 has a relatively coarse thalwegs. I think your threshold of 1e7 results in zero thalwegs which causes the error. I will change the script to throw the error if this happens. At the time being, you can test with different threshold.
Thank you for your response. I'm going to try with your recommended acc_limit.
@cuill I was able to reproduce your results with acc_limit=1e4. Thank you for your help.
@cuill @feiye-vims I'm running the pyDEM
serial script for this GMRT DEM in northern pacific and Bering Sea. It does a good job in extraction of main channels over land and nearshore, yet in most of underwater areas I get these segments of straight lines that doesn't seem to be natural. I tried to play with the parameters in the script (acc_limit and depth limits), however, I couldn't change these patterns. Do you have a solution for this?
@BahramKhazaei-NOAA Normally, we don't need thalwegs in the water areas (e.g., lakes, ponds) to guide meshing. We deleted these straight lines with the pre-defined polygons.
@BahramKhazaei-NOAA It was done in either ArcGIS or QGIS.
@BahramKhazaei-NOAA Normally, we don't need thalwegs in the water areas (e.g., lakes, ponds) to guide meshing. We deleted these straight lines with the pre-defined polygons.
I see. By pre-defined polygons
you mean a polygon that covers the area in which the thalwegs are not needed like the under water areas?
@BahramKhazaei-NOAA No, in the opposite way, a polygon that covers areas where thalwegs are needed. Then after clipping, thalwegs outside the polygon are removed.
@BahramKhazaei-NOAA No, in the opposite way, a polygon that covers areas where thalwegs are needed. Then after clipping, thalwegs outside the polygon are removed.
Sounds good, thank you for your quick response.
When we created the RiverMeshTool, we already had polygons manually made that roughly delineate the shoreline of the ocean and large lakes/estuaries. These waterbodies are considered well resolved so the thalwegs inside them were clipped out.
The target of the RiverMeshTool are rivers that have a clear direction, so we always cut out lakes. However, even if you retain the thalwegs in the lakes, it won't do much harm for the final mesh.
I would always cut out the ocean part though.
When we created the RiverMeshTool, we already had polygons manually made that roughly delineate the shoreline of the ocean and large lakes/estuaries. These waterbodies are considered well resolved so the thalwegs inside them were clipped out.
The target of the RiverMeshTool are rivers that have a clear direction, so we always cut out lakes. However, even if you retain the thalwegs in the lakes, it won't do much harm for the final mesh.
I would always cut out the ocean part though.
That is reasonable.
As a side note for future developments: by taking into account the open water areas, the tool can be used to inform mesh generation process for important underwater features like ridges and continental slopes. These features might not be of immediate importance for physical models but could be important for local or biological applications.
Saeed also mentioned this and I agree. I'm taking a note on this for future development.
I'm thinking maybe using the location of largest bathymetry gradient to identify the edge of an underwater dredged channel or ridges.
Saeed also mentioned this and I agree. I'm taking a note on this for future development.
I'm thinking maybe using the location of largest bathymetry gradient to identify the edge of an underwater dredged channel or ridges.
That is probably the best strategy to start with.
@cuill After fixing the problem with impi, @SorooshMani-NOAA were trying to run the
pyDEM
package on a DEM in the Alaska domain which is extracted from GMRT. However, there was a problem with processing the DEM and his pyDEM script failed. We then tried another DEM that has coverage over the Cook Inlet and got another error that seems like is due to existence of missing values in the DEM (it has 'no data' value in some pixels). Here are the error messages in processing of each DEM:Cook Inlet DEM:
GMRT DEM:
These errors occur when using the sample code from pyDEM readme (run_serial.py just with updated paths for our DEMs). He tried to perform some pre-processing on both DEMs but even with fixed DEMs he got the exact same errors.
I will send you the GMRT file (too big to attach here). It would be great if you can take a look at this and see if you have a solution/fix for this.