NCAR / MPAS-Workflow

Scripts for controlling DA workflows with MPAS-Model and mpas-bundle
Apache License 2.0

Aggregate geoval and ydiags files for each processor #295

Closed ibanos90 closed 6 months ago

ibanos90 commented 6 months ago

Description

Currently, the GOMsaver and YDIAGsaver filters in UFO write one observation feedback file per processor. Although the maxIODAPoolSize parameter lets us specify the maximum number of IO pool members in the IODA writer class, to my knowledge this only works for the obsout diagnostics so far. This PR adds the capability to aggregate the per-processor geoval and ydiags files using a Python script. The file containing the aggregated data follows the same naming convention (i.e. <prefix>_<app>_<suffix>.nc4), so no error occurs during the verification steps. All of this is controlled by the variable concatenateObsFeedback, which can be specified under the variational, hofx, or enkf keys in the scenario YAML files; the default is False. A new Cylc task was added, which runs after the execute step if concatenateObsFeedback is set to True.
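As a rough illustration of the naming rule described above, the sketch below groups per-processor files under the aggregated name they would map to. It is not the script added in this PR. The four-digit rank suffix (e.g. `_0000`) on the per-processor files, the example file names, and the helper `group_by_target` are assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: per-processor files end in a zero-padded rank before the
# extension, e.g. geoval_da_abi_g16_0003.nc4. Dropping the rank gives the
# aggregated name <prefix>_<app>_<suffix>.nc4.
RANKED = re.compile(r"^(?P<stem>.+)_(?P<rank>\d{4})\.nc4$")


def group_by_target(filenames):
    """Map each aggregated file name to its per-processor inputs, ordered by rank."""
    groups = defaultdict(list)
    for name in filenames:
        match = RANKED.match(name)
        if match is None:
            continue  # not a per-processor feedback file (hypothetical rule)
        target = match["stem"] + ".nc4"
        groups[target].append((int(match["rank"]), name))
    return {
        target: [name for _, name in sorted(members)]
        for target, members in groups.items()
    }


# Hypothetical directory listing
files = [
    "geoval_da_abi_g16_0001.nc4",
    "geoval_da_abi_g16_0000.nc4",
    "ydiags_da_abi_g16_0000.nc4",
    "obsout_da_abi_g16.h5",
]
plan = group_by_target(files)
```

Each key of `plan` is the file a concatenation step would write, and each value lists the inputs in rank order. The obsout file is skipped because it does not match the assumed pattern.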

Note that the walltime may need adjusting depending on how many ObsSpaces have geoval and ydiags files. Some timing information for reference: concatenating all 256 geoval files for abi_g16 takes ~20 s (ydiags files take longer). Using dask improves the timing significantly, but the package is not available in the Python environment we are using (my-cylc8.2), which is why I added an environment file to activate NPL.

2024-03-15 10:51:17 - INFO - Working on: geoval abi_g16
2024-03-15 10:51:37 - INFO - Working on: geoval ahi_himawari8
2024-03-15 10:51:56 - INFO - Working on: geoval amsua-cld_aqua
...
2024-03-15 10:55:39 - INFO - Working on: ydiags abi_g16
2024-03-15 10:56:11 - INFO - Working on: ydiags ahi_himawari8
2024-03-15 10:56:45 - INFO - Working on: ydiags amsua-cld_aqua
...
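The per-ObsSpace timings can be read off the log excerpts above. The sketch below does that arithmetic, assuming each "Working on" line marks the start of one item and the next line marks its end; the last item in each excerpt has no end stamp and is left out. The helper name `elapsed_per_item` is made up for this example.

```python
from datetime import datetime

GEOVAL_LOG = """\
2024-03-15 10:51:17 - INFO - Working on: geoval abi_g16
2024-03-15 10:51:37 - INFO - Working on: geoval ahi_himawari8
2024-03-15 10:51:56 - INFO - Working on: geoval amsua-cld_aqua
"""

YDIAGS_LOG = """\
2024-03-15 10:55:39 - INFO - Working on: ydiags abi_g16
2024-03-15 10:56:11 - INFO - Working on: ydiags ahi_himawari8
2024-03-15 10:56:45 - INFO - Working on: ydiags amsua-cld_aqua
"""


def elapsed_per_item(log_text):
    """Return (label, seconds) for every item that is followed by another log line."""
    entries = []
    for line in log_text.splitlines():
        stamp, _level, message = line.split(" - ", 2)
        started = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        label = message.partition("Working on: ")[2]
        entries.append((label, started))
    return [
        (label, int((next_start - start).total_seconds()))
        for (label, start), (_, next_start) in zip(entries, entries[1:])
    ]


geoval_times = elapsed_per_item(GEOVAL_LOG)
ydiags_times = elapsed_per_item(YDIAGS_LOG)
```

For these excerpts the geoval items take about 20 s each and the ydiags items a little over 30 s, consistent with the note that ydiags files take longer.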

This PR also changes maxIODAPoolSize to 1, so only one obsout file is written per ObsSpace.

In addition, this PR updates the meanStateBuildDir path to point to a new build compiled with spack-stack 1.6.0. The previous build, compiled with spack-stack 1.5, was failing after the update to the develop branch.

The computational resources for the 3dhybrid-allsky scenario are also updated.

Issue closed

None

Tests completed

Tier 1:

These were tested with and without the new option. The DA part was also tested by activating the GOMsaver and YDIAGsaver filters in the YAML files.

Scenario (optional):

ibanos90 commented 6 months ago

Tested using the scenario 3dvar_OIE120km_WarmStart.yaml with concatenateObsFeedback: True added in the variational section. I got one aggregated geoval file. Thank you so much @ibanos90 for adding this function.

Thanks so much for testing it @junmeiban!