NOAA-OWP / inundation-mapping

Flood inundation mapping and evaluation software configured to work with U.S. National Water Model.
Other
92 stars 27 forks source link

[21pt] Mosaicing inundation produces nodata raster (threading issue?) #644

Open mluck opened 2 years ago

mluck commented 2 years ago

mosaic_inundation.py produces a raster of nodata values when number of workers is > 1. This does not affect synthesize_test_cases.py or run_test_case.py since run_test_case.py sets workers = 1.

The root of the problem seems to be a threading issue in overlapping_inundation.py coming from threaded=True being set in mosaic_inundation.py:

        if workers > 1:
            threaded = True
        else:
            threaded = False

This seems to be a known issue in mosaic_inundation.mosaic_by_unit():

# Note: This uses threading and not processes. If the number of workers is more than 
# the number of possible threads, no results will be returned. But it is usually
# pretty fast anyways. This needs to be fixed.

Deeper testing verifies that the threading issue originates from OverlappingWindowMerge.merge_rasters() in tools/gms_tools/overlapping_inundation.py:

threaded=True reproduces incorrect outputs, e.g.: overlap.merge_rasters(mosaic_output, threaded=True, workers=1, nodata=nodata) overlap.merge_rasters(mosaic_output, threaded=True, workers=4, nodata=nodata)

threaded=False produces correct outputs, e.g.: overlap.merge_rasters(mosaic_output, threaded=False, workers=1, nodata=nodata) overlap.merge_rasters(mosaic_output, threaded=False, workers=4, nodata=nodata)

It’s not clear if the error is coming from concurrent.futures.ThreadPoolExecutor, but a short-term workaround could be to force threaded=False. Also, this occurred on an AWS EC2 Ubuntu machine so it’s possible that this threading issue might be processor-specific?

Steps to replicate behavior (include URLs)

This reproduces the erroneous output because the default number of workers = 4 sets threaded=True: /foss_fim/tools/gms_tools/mosaic_inundation.py -i /data/temp/12090301_inundation_test.csv -a /data/outputs/dev-matt/12090301/wbd.gpkg -v -t inundation_rasters -m /data/temp/12090301_inundation_test.tif

However, setting workers = 1 produces the correct output: /foss_fim/tools/gms_tools/mosaic_inundation.py -i /data/temp/12090301_inundation_test.csv -a /data/outputs/dev-matt/12090301/wbd.gpkg -v -t inundation_rasters -m /data/temp/12090301_inundation_test.tif -w 1

RobHanna-NOAA commented 1 year ago

This needs to be left open. There is plans to rebuild the inundation code, but memory leaks, multi-threading and multi-processing issues are still in place. We may consider addressing it someday versus a rebuild. But good to have these notes even if we rebuild inundation.

robgpita commented 11 months ago

@mluck, in your opinion, did PR #972 resolve this issue? It seems as though the default workers has changed from 4 to 1. The functionality to provide -w 4 exists, but it is not the default. Maybe removing the multithreading option, or fixing it would fully close this issue?

mluck commented 11 months ago

I don't think #972 resolves the issue since the vulnerability still exists in mosaic_inundation.py. Removing the threaded = True from mosaic_inundation.py (or fixing it) would resolve the issue.