@SorooshMani-NOAA Runs for all 6 lead times of Florence completed successfully, but the post-processing steps failed with `ArrayMemoryError`.
Here is one example:
```
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 2.35 GiB for an array with shape (468, 673957, 2) and data type float32
```
I will look into this later this week to see if I can run it with dask.
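For the record, a minimal sketch of what the dask approach could look like, assuming the post-processing reads netCDF output that xarray can open; the file name `schism_output.nc` and variable `zeta` are hypothetical placeholders, not taken from the workflow:

```python
# Open the output lazily with dask-backed xarray so the full
# (468, 673957, 2) float32 array is never allocated at once.
import xarray as xr

# chunks= makes the load lazy; 24 time steps per chunk is roughly 130 MB here
ds = xr.open_dataset("schism_output.nc", chunks={"time": 24})

# reductions then stream chunk by chunk through dask instead of in one allocation
max_elev = ds["zeta"].max(dim="time").compute()
```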
Thank you for the update @FariborzDaneshvar-NOAA. Please let me know if I can assist you in any way.
As we discussed during the biweekly P-Surge meeting on Thursday, December 14th, we will first run and analyze one storm (Florence) to ensure the workflow works as expected, before running the remaining 24 storms (listed above).
To post-process each leadtime of a storm:
- run `combine_results --schism --adcirc-like-output ./analyze` in the run directory (a looped invocation sketch follows below);
- run the analysis notebook on the `NHC_COLAB_2` cluster: `/home/Fariborz.Daneshvar/notebooks/analyze_ensemble_dask.ipynb`
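A hedged convenience sketch for looping that command over all six leadtimes; the directory naming follows the run paths listed in this thread, and the specific storm prefix is just an example:

```python
# Run the combine step inside each leadtime run directory.
import subprocess
from pathlib import Path

BASE = Path("/work2/noaa/nos-surge/shared/nhc_hurricanes")
for lt in (12, 24, 36, 48, 60, 72):
    run_dir = BASE / f"sandy_2012_OFCL_{lt}hr_korobov_30"
    # same command as above, executed with the run directory as cwd
    subprocess.run(
        ["combine_results", "--schism", "--adcirc-like-output", "./analyze"],
        cwd=run_dir,
        check=True,
    )
```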
Ensemble runs for hurricane Sandy 2012 completed on Hercules. See more updates here: https://github.com/saeed-moghimi-noaa/Next-generation-psurge-tasks/issues/25

I will then be running Harvey 2017 and Michael 2018, and will conduct analyses of the surrogate models before the NHC meeting on Feb. 8.
Run directories for 6 leadtimes (12, 24, 36, 48, 60, and 72) on Hercules are:
/work2/noaa/nos-surge/shared/nhc_hurricanes/sandy_2012_OFCL_{leadtime}hr_korobov_30/
/work2/noaa/nos-surge/shared/nhc_hurricanes/harvey_2017_OFCL_{leadtime}hr_korobov_30/
/work2/noaa/nos-surge/shared/nhc_hurricanes/michael_2018_OFCL_{leadtime}hr_korobov_30/
Runtime summaries for hurricane Sandy on Hercules:
- `pschism_HERCULES_PAHM_TVD-VL` binary on 1 node: ~01:53:00
- `pschism_HERCULES_PAHM_BLD_STANDALONE_TVD-VL` binary on 10 nodes: ~01:48:00
- `pschism_HERCULES_PAHM_BLD_STANDALONE_TVD-VL` binary on 10 nodes (& 80 tasks per node): ~00:36:00

Number of days simulated for each leadtime:

| leadtime | days simulated |
|---|---|
| 12hr | 6.25 |
| 24hr | 6.5 |
| 36hr | 7.0 |
| 48hr | 7.5 |
| 60hr | 7.75 |
| 72hr | 7.75 |
Finished Sandy runs for 12, 24, 36, 48, and 72 hr leadtimes. The meshing step of the 60hr leadtime failed with a PermissionError (maybe due to an issue with the domain). @SorooshMani-NOAA is looking into it.
This is the error I got:
```
2024-02-09:18:04:53,827 INFO [hurricane_mesh.py:79] The mesh command is subset_n_combine.
Traceback (most recent call last):
  File "/opt/conda/envs/ocsmesh/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/envs/ocsmesh/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/scripts/hurricane_mesh.py", line 548, in <module>
    main(args, [hurrmesh_client, subset_client])
  File "/scripts/hurricane_mesh.py", line 97, in main
    clients_dict[cmd].run(args)
  File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/cli/subset_n_combine.py", line 113, in run
    self._main(
  File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/cli/subset_n_combine.py", line 573, in _main
    poly_clipper = self._calculate_clipping_polygon(
  File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/cli/subset_n_combine.py", line 162, in _calculate_clipping_polygon
    geom_cutoff = Geom(
  File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/geom/geom.py", line 74, in __new__
    return GeomCollector(geom, **kwargs)
  File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/geom/collector.py", line 197, in __init__
    in_item.clip(clip_shape)
  File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/raster.py", line 1217, in clip
    with self.modifying_raster(**meta_update) as dest:
  File "/opt/conda/envs/ocsmesh/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/opt/conda/envs/ocsmesh/lib/python3.10/site-packages/ocsmesh/raster.py", line 385, in modifying_raster
    tmpfile = tempfile.NamedTemporaryFile(prefix=tmpdir)
  File "/opt/conda/envs/ocsmesh/lib/python3.10/tempfile.py", line 559, in NamedTemporaryFile
    file = _io.open(dir, mode, buffering=buffering,
  File "/opt/conda/envs/ocsmesh/lib/python3.10/tempfile.py", line 556, in opener
    fd, name = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
  File "/opt/conda/envs/ocsmesh/lib/python3.10/tempfile.py", line 256, in _mkstemp_inner
    fd = _os.open(file, flags, 0o600)
PermissionError: [Errno 13] Permission denied: '/tmp/ocsmesh/gpon_sdn'
ERROR conda.cli.main_run:execute(49): `conda run python -m hurricane_mesh sandy 2012 subset_n_combine /work2/noaa/nos-surge/smani/data/grid/stofs3d_atl_v2.1_eval.gr3 /work2/noaa/nos-surge/smani/data/grid/WNAT_1km.14 /work2/noaa/nos-surge/shared/nhc_hurricanes/sandy_2012_ba17b72f-7589-494e-9ce3-103419600198/windswath --rasters /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n0.0_s-90.0_w-180.0_e-90.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n0.0_s-90.0_w-90.0_e0.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n0.0_s-90.0_w0.0_e90.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n0.0_s-90.0_w90.0_e180.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n90.0_s0.0_w-180.0_e-90.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n90.0_s0.0_w-90.0_e0.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n90.0_s0.0_w0.0_e90.0.tif /work2/noaa/nos-surge/smani/data/dem/GEBCO/gebco_2020_n90.0_s0.0_w90.0_e180.0.tif --out /work2/noaa/nos-surge/shared/nhc_hurricanes/sandy_2012_ba17b72f-7589-494e-9ce3-103419600198/mesh` failed. (See above for error)
```
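For reference, a hedged workaround sketch (an assumption, not something confirmed in this thread): redirect Python's temporary-file directory to a writable scratch location before the meshing step. This only helps if ocsmesh derives its temp location from the standard `tempfile` settings; if `/tmp/ocsmesh` is hardcoded, its permissions would have to be fixed instead. The scratch path below is hypothetical.

```python
# Point tempfile at a writable scratch directory before launching the mesher.
import os
import tempfile

scratch = "/work2/noaa/nos-surge/tmp"  # hypothetical writable location
os.makedirs(scratch, exist_ok=True)
os.environ["TMPDIR"] = scratch   # consulted by tempfile.gettempdir()
tempfile.tempdir = None          # force tempfile to re-read TMPDIR
print(tempfile.gettempdir())     # should now report the scratch path
```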
There was an issue with the leadtime JSON file, so I had to re-run the workflow with the updated file in `input.conf`:
```
L_LEADTIMES_DATASET=$DATA/leadtimes.json
--> L_LEADTIMES_DATASET=$DATA/lead.json
```
All 6 runs of Sandy with updated leadtimes have been completed. Here are the start times of the new simulations (each starts two days before the NHC forecast time for the given leadtime):
| leadtime | simulation start date/time |
|---|---|
| 12-hr | 2012-10-27 12:00:00 |
| 24-hr | 2012-10-27 06:00:00 |
| 36-hr | 2012-10-26 18:00:00 |
| 48-hr | 2012-10-26 06:00:00 |
| 60-hr | 2012-10-25 18:00:00 |
| 72-hr | 2012-10-25 06:00:00 |
Sandy leadtimes from the NHC JSON file:
```json
"2012103000": 0,
"2012102912": 12,
"2012102906": 24,
"2012102818": 36,
"2012102806": 48,
"2012102718": 60,
"2012102706": 72
```
@SorooshMani-NOAA I noticed that the first two leadtimes (12 and 24) are 6 hours apart! Shouldn't it be 12 hours? All the other leadtimes are 12 hours apart. Should we tag Laura or Andy on this issue (or post it on the other board)?
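For a quick sanity check, the start times in the table above and the 6-hour gap can both be reproduced from the JSON entries (a small sketch using the 48-hour offset stated above):

```python
# start = forecast time - 48 h, per the comment above
from datetime import datetime, timedelta

leads = {"2012102912": 12, "2012102906": 24, "2012102818": 36,
         "2012102806": 48, "2012102718": 60, "2012102706": 72}
for key, hours in leads.items():
    forecast = datetime.strptime(key, "%Y%m%d%H")
    print(f"{hours}-hr leadtime: starts {forecast - timedelta(hours=48)}")
# 12-hr (29th 12Z) and 24-hr (29th 06Z) are only 6 h apart,
# while every other consecutive pair is 12 h apart.
```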
The post-processing steps for Sandy did not complete. This issue will be investigated here: https://github.com/oceanmodeling/ondemand-storm-workflow/issues/43
@SorooshMani-NOAA with the new node/task setup (mentioned here), Michael runs are finishing in less than 15 min! Thanks :)
Let's do more storms... I think I found the issue with the missing name at https://github.com/oceanmodeling/ondemand-storm-workflow/issues/43.
Just finished re-running Sandy, Harvey, and Michael with updated leadtimes, binaries, and the new configuration (10 nodes & 80 tasks per node).
Completed runs for Dorian 2019, Delta 2020, and Laura 2020!

Post-processing step did not complete for high leadtimes of Dorian 2019 (48hr, 60hr, and 72hr). Investigating it here: https://github.com/oceanmodeling/ondemand-storm-workflow/issues/43

Workflow failed for early leadtimes of Marco 2020; @SorooshMani-NOAA is working on it: https://github.com/noaa-ocs-modeling/EnsemblePerturbation/issues/137 Thanks!
Post-processing step for some leadtimes of Irene, Isaac, Matthew, Irma, and Isaias did not complete, so I need to run them with dask separately.
Random SCHISM runs for ike_2008 with 24 hr leadtime failed! I posted the issue here: https://github.com/oceanmodeling/ondemand-storm-workflow/issues/45
@SorooshMani-NOAA thanks for fixing the Marco issue (no COOPS station in the domain). I was able to run the workflow for the 12-, 24-, 36-, and 48-hr leadtimes of Marco. As we discussed before, the 60- and 72-hr leadtimes failed with this error:
```
2024-03-07:01:41:17,347 INFO [hurricane_data.py:107] Creating OFCL track for 60 hours before landfall forecast...
Traceback (most recent call last):
  File "/opt/conda/envs/info/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/envs/info/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/scripts/hurricane_data.py", line 327, in <module>
    main(args)
  File "/scripts/hurricane_data.py", line 127, in main
    forecast_start = candidates[
  File "/opt/conda/envs/info/lib/python3.9/site-packages/pandas/core/indexing.py", line 1191, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/opt/conda/envs/info/lib/python3.9/site-packages/pandas/core/indexing.py", line 1752, in _getitem_axis
    self._validate_integer(key, axis)
  File "/opt/conda/envs/info/lib/python3.9/site-packages/pandas/core/indexing.py", line 1685, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
ERROR conda.cli.main_run:execute(124): `conda run python -m hurricane_data
  --date-range-outpath /work2/noaa/nos-surge/shared/nhc_hurricanes/marco_2020_f94800f8-e299-4bd4-bd41-d92fe82f5e54/setup/dates.csv
  --track-outpath /work2/noaa/nos-surge/shared/nhc_hurricanes/marco_2020_f94800f8-e299-4bd4-bd41-d92fe82f5e54/nhc_track/hurricane-track.dat
  --swath-outpath /work2/noaa/nos-surge/shared/nhc_hurricanes/marco_2020_f94800f8-e299-4bd4-bd41-d92fe82f5e54/windswath
  --station-data-outpath /work2/noaa/nos-surge/shared/nhc_hurricanes/marco_2020_f94800f8-e299-4bd4-bd41-d92fe82f5e54/coops_ssh/stations.nc
  --station-location-outpath /work2/noaa/nos-surge/shared/nhc_hurricanes/marco_2020_f94800f8-e299-4bd4-bd41-d92fe82f5e54/setup/stations.csv
  --past-forecast
  --hours-before-landfall 60
  --lead-times /work2/noaa/nos-surge/smani/data/lead.json marco 2020` failed. (See above for error)
```
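To make the failure mode concrete, here is a hedged sketch of the pattern behind that IndexError (the filtering logic is assumed from the traceback, not copied from hurricane_data.py): for these high leadtimes no OFCL advisory passes the time filter, so the candidates frame is empty and positional indexing goes out of bounds.

```python
import pandas as pd

candidates = pd.DataFrame()  # stand-in for the filtered OFCL advisories
if candidates.empty:
    # guard that would turn the opaque IndexError into a clear message
    raise ValueError("no OFCL advisory matches the requested leadtime")
forecast_start = candidates.iloc[0]  # the positional lookup that blew up
```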
Re-ran Eta 2020 after the new fix in stormevents.
Successfully ran ike_2008 with 24-hr leadtime (with the updated schism.sbatch from Soroosh's directory).
The only remaining runs are 60- and 72-hr leadtimes of Marco.
Results will be reviewed here: https://github.com/saeed-moghimi-noaa/Next-generation-psurge-tasks/issues/33
Set up and run the workflow for storms and lead times as listed in https://github.com/saeed-moghimi-noaa/Next-generation-psurge-tasks/issues/14
Run directory structure on Hercules:
/work2/noaa/nos-surge/shared/nhc_hurricanes/<storm>_<track>_<lead_time>hr_<sampling_method>_<no_of_perturbations>
where:
- `<storm>`: {Gustav_2008, Ike_2008, IreneNC_2011, IreneNJ_2011, Isaac_2012, Sandy_2012, Hermine_2016, MatthewSC_2016, MatthewFL_2016, Harvey_2017, IrmaFL_2017, Nate_2017, Florence_2018, Gordon_2018, Michael_2018, Barry_2019, Dorian_2019, Beta_2020, Cristobal_2020, Delta_2020, Eta_2020, Hanna_2020, Isaias_2020, Laura_2020, Marco_2020, Sally_2020, Zeta_2020}
- `<track>`: OFCL
- `<lead_time>`: [12, 24, 36, 48, 60, 72]
- `<sampling_method>`: korobov
- `<no_of_perturbations>`: 30
Example run directory:
/nhc/hurricanes/florence_2018_OFCL_48hr_korobov_30/
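As an illustration of how the pieces compose, a throwaway sketch that enumerates the implied directory names (the storm list here is truncated, and the base path is taken from the run directories earlier in the thread):

```python
# Enumerate run directories for a few storms across all six leadtimes.
storms = ["gustav_2008", "ike_2008", "florence_2018"]  # abbreviated; names are lowercase in the actual paths
leadtimes = [12, 24, 36, 48, 60, 72]
for storm in storms:
    for lt in leadtimes:
        print(f"/work2/noaa/nos-surge/shared/nhc_hurricanes/{storm}_OFCL_{lt}hr_korobov_30/")
```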
Update: Per Laura's comment, IrmaFL refers to Irma, and for both Matthew and Irene, which are shore-parallel with two landfalls, only one run is enough (but they will be processed differently for each landfall).

Checklist of completed runs with 30 ensemble members for six leadtimes:
- Gustav, 2008
- Ike, 2008
- Irene, 2011 (NC & NJ)
- Isaac, 2012
- Sandy, 2012
- Hermine, 2016
- Matthew, 2016 (SC & FL)
- Harvey, 2017
- Irma, 2017 (FL)
- Nate, 2017
- Florence, 2018
- Gordon, 2018
- Michael, 2018
- Barry, 2019
- Dorian, 2019
- Beta, 2020
- Cristobal, 2020
- Delta, 2020
- Eta, 2020
- Hanna, 2020
- Isaias, 2020
- Laura, 2020
- Marco, 2020
- Sally, 2020
- Zeta, 2020