noaa-ocs-modeling / EnsemblePerturbation

perturbation of coupled model input over a space of input variables
https://ensembleperturbation.readthedocs.io
Creative Commons Zero v1.0 Universal

Run `ondemand-storm-workflow` for 25 storms with 6 different lead times #123

Closed · FariborzDaneshvar-NOAA closed this issue 7 months ago

FariborzDaneshvar-NOAA commented 10 months ago

Setup and run workflow for storms and lead times as listed in https://github.com/saeed-moghimi-noaa/Next-generation-psurge-tasks/issues/14

Run directory structure on:

where:

Example run directory: /nhc/hurricanes/florence_2018_OFCL_48hr_korobov_30/

Update: Per Laura's comment, IrmaFL refers to Irma, and for both Matthew and Irene, which are shore-parallel with two landfalls, only one run is enough (but they will be processed differently for each landfall).

Checklist of completed runs with a 30-member ensemble for six leadtimes:

FariborzDaneshvar-NOAA commented 10 months ago

@SorooshMani-NOAA Runs for all 6 leadtimes of Florence completed successfully, but the post-processing steps failed with `ArrayMemoryError`. Here is one example: `numpy.core._exceptions._ArrayMemoryError: Unable to allocate 2.35 GiB for an array with shape (468, 673957, 2) and data type float32`. I will look into this later this week to see if I can run it with dask.
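
For scale: the failing allocation matches a float32 array of shape (468, 673957, 2), about 2.35 GiB. A minimal sketch of the dask approach mentioned above (the shape is taken from the error message; the chunking choice is illustrative, not from the actual workflow):

```python
import dask.array as da
import numpy as np

# Same shape/dtype as the failing allocation, but split into chunks so that
# only one ~180 MB slice is resident in memory at a time.
arr = da.zeros((468, 673957, 2), dtype=np.float32,
               chunks=(468, 50_000, 2))

# Reductions stream chunk by chunk instead of materializing the full array.
total = arr.sum().compute()
```

The same idea applies to the real output files: opening them with dask-backed chunks (e.g. via xarray) lets the ensemble post-processing run within a normal node's memory budget.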

SorooshMani-NOAA commented 10 months ago

Thank you for the update @FariborzDaneshvar-NOAA. Please let me know if I can assist you in any way.

FariborzDaneshvar-NOAA commented 9 months ago

As we discussed during the biweekly P-Surge meeting on Thursday, December 14th, we will first run and analyze one storm (Florence) to ensure the workflow is working as expected, before running the remaining 24 storms (listed above).

FariborzDaneshvar-NOAA commented 9 months ago

For post-processing each leadtime of a storm:

  1. Run the following command (on a compute node) to combine results: `combine_results --schism --adcirc-like-output ./analyze`
  2. Follow the steps provided here to use dask with a large-memory node, then run `analyze_ensemble.py`.
    Path to a notebook used to analyze the ensemble on the NHC_COLAB_2 cluster: /home/Fariborz.Daneshvar/notebooks/analyze_ensemble_dask.ipynb
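
The dask setup referenced in step 2 might look like the following sketch (the worker count, thread count, and memory limit are illustrative assumptions, not the actual cluster configuration):

```python
from dask.distributed import Client, LocalCluster

# Start an in-process dask cluster with an explicit memory cap, so the
# ensemble analysis spills/streams instead of dying with ArrayMemoryError.
cluster = LocalCluster(n_workers=1, threads_per_worker=2,
                       processes=False, memory_limit="4GB")
client = Client(cluster)

# ... run the analyze_ensemble.py logic here; dask-backed operations will
# automatically use this client ...

# client.close(); cluster.close()  # when finished
```

On a shared HPC node, the same pattern with more workers and a larger `memory_limit` is what "dask and a big memory" amounts to in practice.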
FariborzDaneshvar-NOAA commented 8 months ago

Ensemble runs for hurricane Sandy 2012 completed on Hercules. See more updates here: https://github.com/saeed-moghimi-noaa/Next-generation-psurge-tasks/issues/25

I will next run Harvey 2017 and Michael 2018 and conduct analyses of the surrogate models before the NHC meeting on Feb. 8.

Run directories for the 6 leadtimes (12, 24, 36, 48, 60, and 72 hr) on Hercules are:

FariborzDaneshvar-NOAA commented 8 months ago

Runtime summaries for hurricane Sandy on Hercules:

Number of days simulated for each leadtime:

leadtime days simulated
12hr 6.25
24hr 6.5
36hr 7.0
48hr 7.5
60hr 7.75
72hr 7.75

FariborzDaneshvar-NOAA commented 7 months ago

Finished Sandy runs for the 12-, 24-, 36-, 48-, and 72-hr leadtimes. The meshing step for the 60-hr leadtime failed with a PermissionError (maybe due to an issue with the domain). @SorooshMani-NOAA is looking into it.

This is the error I got:

2024-02-09:18:04:53,827 INFO     [hurricane_mesh.py:79] The mesh command is subset_n_combine.                           
Traceback (most recent call last):
   File "/opt/conda/envs/ocsmesh/lib/python3.10/runpy.py", line 196, in _run_module_as_main                                  
      return _run_code(code, main_globals, None,
   File "/opt/conda/envs/ocsmesh/lib/python3.10/runpy.py", line 86, in _run_code                                             
      exec(code, run_globals)                                                                                               
   File "/scripts/hurricane_mesh.py", line 548, in <module>                                                                  
      main(args, [hurrmesh_client, subset_client])                                                                          
   File "/scripts/hurricane_mesh.py", line 97, in main                                                                       
      clients_dict[cmd].run(args)                                                                                           
   File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/cli/subset_n_combine.py", line 113, in run             
      self._main(                                                                                                           
   File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/cli/subset_n_combine.py", line 573, in _main           
      poly_clipper = self._calculate_clipping_polygon(                                                                      
   File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/cli/subset_n_combine.py", line 162, in _calculate_clipping_polygon
      geom_cutoff = Geom(                                                                                                   
   File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/geom/geom.py", line 74, in __new__                     
      return GeomCollector(geom, **kwargs)                                                                                  
   File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/geom/collector.py", line 197, in __init__ 
      in_item.clip(clip_shape) 
   File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/raster.py", line 1217, in clip                         
      with self.modifying_raster(**meta_update) as dest:                                                                    
   File "/opt/conda/envs/ocsmesh/lib/python3.10/contextlib.py", line 135, in __enter__                                       
      return next(self.gen)                                                                                                 
   File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/raster.py", line 385, in modifying_raster              
      tmpfile = tempfile.NamedTemporaryFile(prefix=tmpdir)                                                                  
   File "/opt/conda/envs/ocsmesh/lib/python3.10/tempfile.py", line 559, in NamedTemporaryFile      
      file = _io.open(dir, mode, buffering=buffering,                                                                       
   File "/opt/conda/envs/ocsmesh/lib/python3.10/tempfile.py", line 556, in opener                                            
      fd, name = _mkstemp_inner(dir, prefix, suffix, flags, output_type)                                                    
   File "/opt/conda/envs/ocsmesh/lib/python3.10/tempfile.py", line 256, in _mkstemp_inner                                    
      fd = _os.open(file, flags, 0o600)                                                                                   
PermissionError: [Errno 13] Permission denied: '/tmp/ocsmesh/gpon_sdn'                                                  
ERROR conda.cli.main_run:execute(49): `conda run python -m hurricane_mesh sandy 2012 subset_n_combine /work2/noaa/nos-surge/smani/data/grid/stofs3d_atl_v2.1_eval.gr3 /work2/noaa/nos-surge/smani/data/grid/WNAT_1km.14 /work2/noaa/nos-surge/shared/nhc_hurricanes/sandy_2012_ba17b72f-7589-494e-9ce3-103419600198/windswath --rasters /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n0.0_s-90.0_w-180.0_e-90.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n0.0_s-90.0_w-90.0_e0.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n0.0_s-90.0_w0.0_e90.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n0.0_s-90.0_w90.0_e180.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n90.0_s0.0_w-180.0_e-90.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n90.0_s0.0_w-90.0_e0.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n90.0_s0.0_w0.0_e90.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n90.0_s0.0_w90.0_e180.0.tif --out /work2/noaa/nos-surge/shared/nhc_hurricanes/sandy_2012_ba17b72f-7589-494e-9ce3-103419600198/mesh` failed. (See above for error)                                                                                 
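
The traceback shows `tempfile` failing to create a file under a `/tmp/ocsmesh` prefix that the current user cannot write to. A generic workaround sketch (the scratch path below is a hypothetical example, not the actual fix applied) is to point Python's temp-file machinery at a user-writable directory:

```python
import os
import tempfile

# Assumed user-writable scratch location; adjust to the cluster's convention.
scratch = os.path.join(os.path.expanduser("~"), "tmp")
os.makedirs(scratch, exist_ok=True)

# TMPDIR affects child processes; tempfile.tempdir affects this process.
os.environ["TMPDIR"] = scratch
tempfile.tempdir = scratch

# Temp files are now created under the scratch directory.
with tempfile.NamedTemporaryFile(prefix="ocsmesh_") as fh:
    assert fh.name.startswith(scratch)
```

This avoids collisions with temp directories left behind by other users on shared nodes.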
FariborzDaneshvar-NOAA commented 7 months ago

There was an issue with the leadtime JSON file, so I had to re-run the workflow with the updated file in input.conf:

FariborzDaneshvar-NOAA commented 7 months ago

All 6 runs of Sandy with updated leadtimes have been completed. Here are the start times of the new simulations (each starts two days before the leadtime date/time given by NHC):

leadtime simulation start date/time
12-hr 2012-10-27 12:00:00
24-hr 2012-10-27 06:00:00
36-hr 2012-10-26 18:00:00
48-hr 2012-10-26 06:00:00
60-hr 2012-10-25 18:00:00
72-hr 2012-10-25 06:00:00

Sandy leadtimes from the NHC JSON file:

"2012103000": 0,
"2012102912": 12,
"2012102906": 24,
"2012102818": 36,
"2012102806": 48,
"2012102718": 60,
"2012102706": 72
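
The "start two days before the NHC leadtime date/time" rule from the table above can be checked mechanically; this sketch reproduces it (the JSON content is copied from this comment, the code itself is illustrative):

```python
import json
from datetime import datetime, timedelta

# Leadtime mapping copied from the NHC JSON file quoted above.
leadtimes = json.loads("""{
    "2012103000": 0,  "2012102912": 12, "2012102906": 24,
    "2012102818": 36, "2012102806": 48, "2012102718": 60,
    "2012102706": 72
}""")

# Each simulation starts two days before its leadtime date/time.
starts = {
    hours: datetime.strptime(stamp, "%Y%m%d%H") - timedelta(days=2)
    for stamp, hours in leadtimes.items()
}
# e.g. the 12-hr leadtime simulation starts 2012-10-27 12:00:00,
# matching the table of simulation start times above.
```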

@SorooshMani-NOAA I noticed that the first two leadtimes (12 and 24) are only 6 hours apart; shouldn't it be 12 hours? The other leadtimes are 12 hours apart. Should we tag Laura or Andy on this issue (or post it on the other board)?

FariborzDaneshvar-NOAA commented 7 months ago

The post-processing steps for Sandy did not complete. This issue will be investigated here: https://github.com/oceanmodeling/ondemand-storm-workflow/issues/43

FariborzDaneshvar-NOAA commented 7 months ago

@SorooshMani-NOAA with the new node/task setup (mentioned here), Michael runs are finishing in less than 15 min! Thanks :)

SorooshMani-NOAA commented 7 months ago

Let's do more storms ... I think I found the issue with the missing name at https://github.com/oceanmodeling/ondemand-storm-workflow/issues/43.

FariborzDaneshvar-NOAA commented 7 months ago


Run directories for the 6 leadtimes (12, 24, 36, 48, 60, and 72 hr) on Hercules are:

  • /work2/noaa/nos-surge/shared/nhc_hurricanes/sandy_2012_OFCL_{leadtime}hr_korobov_30/
  • /work2/noaa/nos-surge/shared/nhc_hurricanes/harvey_2017_OFCL_{leadtime}hr_korobov_30/
  • /work2/noaa/nos-surge/shared/nhc_hurricanes/michael_2018_OFCL_{leadtime}hr_korobov_30/

Just finished re-running all these storms with updated leadtimes, binaries, and new configurations (10 nodes & 80 tasks per node).

FariborzDaneshvar-NOAA commented 7 months ago

Completed runs for Dorian (2019), Delta (2020), and Laura (2020)! The post-processing step did not complete for the higher leadtimes of Dorian 2019 (48-, 60-, and 72-hr). Investigating it here: https://github.com/oceanmodeling/ondemand-storm-workflow/issues/43

FariborzDaneshvar-NOAA commented 7 months ago

The workflow failed for early leadtimes of Marco 2020; @SorooshMani-NOAA is working on it: https://github.com/noaa-ocs-modeling/EnsemblePerturbation/issues/137 Thanks!

FariborzDaneshvar-NOAA commented 7 months ago

The post-processing step for some leadtimes of Irene, Isaac, Matthew, Irma, and Isaias did not complete, so I need to run them separately with dask.

FariborzDaneshvar-NOAA commented 7 months ago

Random SCHISM runs for ike_2008 with the 24-hr leadtime failed! I posted the issue here: https://github.com/oceanmodeling/ondemand-storm-workflow/issues/45

FariborzDaneshvar-NOAA commented 7 months ago

@SorooshMani-NOAA thanks for fixing the Marco issue (no CO-OPS station in the domain). I was able to run the workflow for the 12-, 24-, 36-, and 48-hr leadtimes of Marco. As we discussed before, the 60- and 72-hr leadtimes failed with this error:

2024-03-07:01:41:17,347 INFO     [hurricane_data.py:107] Creating OFCL track for 60 hours before landfall forecast...
Traceback (most recent call last):
  File "/opt/conda/envs/info/lib/python3.9/runpy.py", line 197, in _run_module_as_main 
    return _run_code(code, main_globals, None, 
  File "/opt/conda/envs/info/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/scripts/hurricane_data.py", line 327, in <module> 
    main(args)  
  File "/scripts/hurricane_data.py", line 127, in main 
    forecast_start = candidates[ 
  File "/opt/conda/envs/info/lib/python3.9/site-packages/pandas/core/indexing.py", line 1191, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis) 
  File "/opt/conda/envs/info/lib/python3.9/site-packages/pandas/core/indexing.py", line 1752, in _getitem_axis
    self._validate_integer(key, axis)
  File "/opt/conda/envs/info/lib/python3.9/site-packages/pandas/core/indexing.py", line 1685, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
ERROR conda.cli.main_run:execute(124): `conda run python -m hurricane_data 
--date-range-outpath /work2/noaa/nos-surge/shared/nhc_hurricanes/marco_2020_f94800f8-e299-4bd4-bd41-d92fe82f5e54/setup/dates.csv 
--track-outpath /work2/noaa/nos-surge/shared/nhc_hurricanes/marco_2020_f94800f8-e299-4bd4-bd41-d92fe82f5e54/nhc_track/hurricane-track.dat 
--swath-outpath /work2/noaa/nos-surge/shared/nhc_hurricanes/marco_2020_f94800f8-e299-4bd4-bd41-d92fe82f5e54/windswath 
--station-data-outpath /work2/noaa/nos-surge/shared/nhc_hurricanes/marco_2020_f94800f8-e299-4bd4-bd41-d92fe82f5e54/coops_ssh/stations.nc 
--station-location-outpath /work2/noaa/nos-surge/shared/nhc_hurricanes/marco_2020_f94800f8-e299-4bd4-bd41-d92fe82f5e54/setup/stations.csv 
--past-forecast 
--hours-before-landfall 60 
--lead-times /work2/noaa/nos-surge/smani/data/lead.json marco 2020` failed. (See above for error)
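
For context, the `IndexError` above comes from positional indexing into an empty selection: for the 60- and 72-hr leadtimes there is apparently no OFCL advisory far enough before landfall. A hedged sketch of a guard that would surface the real cause (`pick_forecast_start` is a hypothetical helper, not the actual hurricane_data.py code):

```python
import pandas as pd

def pick_forecast_start(candidates: pd.DataFrame) -> pd.Series:
    """Return the first candidate advisory row, with a clear error
    when no advisory exists for the requested leadtime."""
    if candidates.empty:
        # Without this guard, candidates.iloc[0] raises the cryptic
        # "single positional indexer is out-of-bounds" IndexError.
        raise ValueError(
            "no OFCL advisory found far enough before landfall "
            "for the requested leadtime"
        )
    return candidates.iloc[0]
```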
FariborzDaneshvar-NOAA commented 7 months ago

Re-ran Eta 2020 after the new fix in stormevents.

FariborzDaneshvar-NOAA commented 7 months ago

Successfully ran ike_2008 with the 24-hr leadtime (with an updated schism.sbatch from Soroosh's directory).

FariborzDaneshvar-NOAA commented 7 months ago

The only remaining runs are the 60- and 72-hr leadtimes of Marco.

FariborzDaneshvar-NOAA commented 6 months ago

Results will be reviewed here: https://github.com/saeed-moghimi-noaa/Next-generation-psurge-tasks/issues/33