dtarb / TauDEM

Terrain Analysis Using Digital Elevation Models (TauDEM) software for hydrologic terrain analysis and channel network extraction.
http://hydrology.usu.edu/taudem

Fix bug in getdxdyc for parallel runs #238

Open jcphill opened 2 years ago

jcphill commented 2 years ago

linearpart::getdxdyc() would silently fail to return values for neighbor cells from other ranks, resulting in bad AreaDinf output.

dtarb commented 2 years ago

Do you have an example where this actually causes a problem? I have not investigated this specifically yet, but I am skeptical because AreaDinf has been tested a lot with multiple processes and ranks, and I think the approach of keeping a buffer row at the bounds of each rank and swapping after each pass likely prevents an actual error.
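For readers unfamiliar with the buffer-row scheme mentioned here, a minimal sketch follows. It is a plain-Python simulation with no MPI, and `Partition`, `split_rows`, and `exchange_ghost_rows` are hypothetical names invented for illustration, not TauDEM's `linearpart` API. It only shows the general idea: each rank owns a stripe of rows plus one buffer (ghost) row on each side, and a read that crosses the stripe boundary has to be served from the ghost row or fail loudly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Partition:
    """Rows owned by one simulated rank, plus one ghost row on each side."""
    rows: List[List[float]]
    top_ghost: Optional[List[float]] = None
    bottom_ghost: Optional[List[float]] = None

    def get(self, col: int, row: int) -> float:
        """Read a cell; row -1 and row len(rows) map to the ghost rows."""
        if 0 <= row < len(self.rows):
            return self.rows[row][col]
        if row == -1 and self.top_ghost is not None:
            return self.top_ghost[col]
        if row == len(self.rows) and self.bottom_ghost is not None:
            return self.bottom_ghost[col]
        # Fail loudly instead of leaving the caller with a stale value.
        raise IndexError(f"row {row} is outside this partition and its ghost rows")


def split_rows(grid: List[List[float]], n_parts: int) -> List[Partition]:
    """Split a grid into contiguous row stripes (assumes n_parts <= len(grid))."""
    base, extra = divmod(len(grid), n_parts)
    parts, start = [], 0
    for p in range(n_parts):
        size = base + (1 if p < extra else 0)
        parts.append(Partition([row[:] for row in grid[start:start + size]]))
        start += size
    return parts


def exchange_ghost_rows(parts: List[Partition]) -> None:
    """Copy each stripe's edge rows into its neighbours' ghost rows."""
    for upper, lower in zip(parts, parts[1:]):
        lower.top_ghost = upper.rows[-1][:]
        upper.bottom_ghost = lower.rows[0][:]


# Illustrative 6x4 grid split across two simulated ranks.
grid = [[float(10 * r + c) for c in range(4)] for r in range(6)]
parts = split_rows(grid, 2)
exchange_ghost_rows(parts)

# The second stripe reads the row just above it, which the first stripe owns.
value_from_neighbor = parts[1].get(col=2, row=-1)
```

Raising on an out-of-range read is a design choice made for this sketch only; it makes a missed neighbor lookup visible instead of silent.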

jcphill commented 2 years ago

Yes. In TauDEM-Test-Data/Input/Geographic, running "mpiexec -np XXX AreaDinf enogeo.tif" with different rank counts gives enogeosca.tif files for which gdalcompare.py reports pixel differences (thousands of pixels, but a maximum difference of 2 or so).

dtarb commented 2 years ago

Thanks. I'll check it out.

jcphill commented 2 years ago

Example output (from original, unfixed version):

$ mpiexec -n 1 ../../../build/areadinf -ang enogeoang.tif -sca enogeosca1.tif
AreaDinf version 5.3.9
Input file enogeoang.tif has geographic coordinate system.
This run may take on the order of 1 minutes to complete.
This estimate is very approximate.
Run time is highly uncertain as it depends on the complexity of the input data
and speed and memory of the computer. This estimate is based on our testing on
a dual quad core Dell Xeon E5405 2.0GHz PC with 16GB RAM.
Nodata value input to create partition from file: -340282346638528859811704183484516925440.000000
Nodata value recast to float used in partition raster: -340282346638528859811704183484516925440.000000
Processors: 1
Read time: 0.137529
Compute time: 1.248470
Write time: 0.041544
Total time: 1.427543

$ mpiexec -n 12 ../../../build/areadinf -ang enogeoang.tif -sca enogeosca12.tif
AreaDinf version 5.3.9
Input file enogeoang.tif has geographic coordinate system.
Nodata value input to create partition from file: -340282346638528859811704183484516925440.000000
Nodata value recast to float used in partition raster: -340282346638528859811704183484516925440.000000
This run may take on the order of 1 minutes to complete.
This estimate is very approximate.
Run time is highly uncertain as it depends on the complexity of the input data
and speed and memory of the computer. This estimate is based on our testing on
a dual quad core Dell Xeon E5405 2.0GHz PC with 16GB RAM.
Processors: 12
Read time: 0.028167
Compute time: 0.140669
Write time: 0.033118
Total time: 0.201954

$ gdalcompare.py enogeosca1.tif enogeosca12.tif
Files differ at the binary level.
Band 1 checksum difference:
  Golden: 5380
  New: 5431
  Pixels Differing: 31527
  Maximum Pixel Difference: 1.875
Differences Found: 2
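To make the "Pixels Differing" and "Maximum Pixel Difference" figures concrete, here is a small sketch of how such metrics can be computed for two equally sized bands. `compare_bands` is a hypothetical helper written for this note, not gdalcompare.py's actual code, and the two tiny arrays are made-up values rather than data from the enogeo rasters.

```python
from typing import List, Tuple


def compare_bands(golden: List[List[float]],
                  new: List[List[float]]) -> Tuple[int, float]:
    """Return (number of differing pixels, largest absolute difference)."""
    if len(golden) != len(new) or any(
        len(g_row) != len(n_row) for g_row, n_row in zip(golden, new)
    ):
        raise ValueError("rasters must have the same shape")

    differing = 0
    max_diff = 0.0
    for g_row, n_row in zip(golden, new):
        for g, n in zip(g_row, n_row):
            diff = abs(g - n)
            if diff > 0:
                differing += 1
                max_diff = max(max_diff, diff)
    return differing, max_diff


# Made-up stand-ins for a serial result and a multi-rank result.
serial = [[1.0, 2.0, 3.0], [1.0, 4.5, 6.0]]
parallel = [[1.0, 2.0, 3.0], [1.0, 2.625, 6.0]]

differing, max_diff = compare_bands(serial, parallel)
print(f"Pixels Differing: {differing}")
print(f"Maximum Pixel Difference: {max_diff}")
```

If the algorithm is independent of the partitioning, one would expect both numbers to be zero (or within floating-point noise) for every rank count.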

jcphill commented 2 years ago

Do you have time to look at this?

dtarb commented 2 years ago

Sorry - I have not had time yet.