NCAR / DART

Data Assimilation Research Testbed
https://dart.ucar.edu/
Apache License 2.0

New interface to the Aether lat-lon model #635

Closed kdraeder closed 6 months ago

kdraeder commented 7 months ago

Description:

Add a new interface to enable assimilation with the lat-lon formulation of the Aether space weather model.

Fixes issue

Fixes #559

Tests

We compiled and built the state conversion programs aether_to_dart and dart_to_aether and tested them on the latest Aether restart file sets. model_mod_check passes tests 1-5 and 7, except for the expected failure of interpolation at the poles. Filter is able to assimilate several AIRS temperature observations and GPS profiles of electron density (even though electron density is not available in the restart files).

kdraeder commented 7 months ago

Here's the first batch of review comments on the documentation. A couple of your NOTES and WARNINGS are not showing up in the html documentation, so definitely change those.

Thanks for catching so many mistakes! I'll work on the readme.rst and the state transform files. If I finish all that and @johnsonbk is busy with other things, I'll work on the model_mod.

johnsonbk commented 7 months ago

Thanks for everyone's work on this pull request. I ran into two problems following the directions in readme.rst.

First problem: model_mod_check test number 2 fails

Starting with a fresh clone of DART

cd <abs path to local DART installation>
git clone https://github.com/NCAR/DART.git DART
cd DART/
git fetch origin
git switch aether
cd build_templates
mv <suitable mkmf.template> mkmf.template
cd ../models/aether_lat-lon/work
sftp <user>@data-access.ucar.edu
get /glade/work/raeder/Exp/Aether/Ens_renamed_derecho.tgz
exit
tar -zxvf Ens_renamed_derecho.tgz
cp Ens_renamed_derecho/filter_input_0001.nc ./
./quickbuild.sh
# The executables compile
./model_mod_check
...
***************** RUNNING    TEST 2    ***********************
 -- Read and write restart file
**************************************************************
--------------------------------------------------------------
 Reading File : filter_input_0001.nc
--------------------------------------------------------------
 ERROR FROM:
  source : dart_time_io_mod.f90
  routine: read_model_time:
  message:  inconsistent calendar types between DART program and input file.
  message: ...  DART initialized with: NO_CALENDAR File uses: GREGORIAN
  message: ...  You may need to supply a model-specific "read_model_time()" to read the time.

Perhaps this failure is related to commit 3c7d8d?

Second problem: aether_to_dart doesn't use the aether_restart_dirname namelist entry in open_block_file

While trying to run aether_to_dart:

pwd 
<abs path to local DART installation>/DART/models/aether_lat-lon/work
vim input.nml
# Edit input.nml to set
aether_restart_dirname = 
      '<abs path to local DART installation>/DART/models/aether_lat-lon/work/Ens_renamed_derecho/'

# Continuing with the directions in readme.rst:
cd <aether_restart_dirname>
mkdir Orig
cp *m0000* Orig/

The following command, which changes directory back to the work directory, is not in the directions in readme.rst, but the work directory is where aether_to_dart is compiled.

cd <abs path to local DART installation>/DART/models/aether_lat-lon/work
./aether_to_dart  0
...
ERROR FROM:
  source : aether_lat-lon/transform_state_mod.f90
  routine: open_block_file
  message: cannot open file grid_g0000.nc for read

This error is thrown when aether_to_dart and input.nml are in the aether work directory and the aether netcdf files are in a different directory, <aether_restart_dirname>.

Attempt number 1 to fix this error involves copying the grid files to the work directory

cp <aether_restart_dirname>/grid_g000?.nc ./
./aether_to_dart 0
...
ERROR FROM:
  source : aether_lat-lon/transform_state_mod.f90
  routine: open_block_file
  message: cannot open file neutrals_m0000_g0000.nc for read

Attempt number 2 to fix this error involves copying the aether_to_dart executable and input.nml to <aether_restart_dirname>.

cp aether_to_dart <aether_restart_dirname>
cp input.nml <aether_restart_dirname>
cd <aether_restart_dirname>
./aether_to_dart
...
aether_to_dart Successfully converted the Aether restart files to 'filter_input_0001.nc'

 --------------------------------------
 Finished ... at YYYY MM DD HH MM SS = 
                 2024  2 14 11 23 29
 --------------------------------------

The cause of this unexpected behavior seems to be that the function open_block_file is not passed a filename including the path of <aether_restart_dirname>. The executable only works if it is copied to <aether_restart_dirname> or if <aether_restart_dirname> is the same directory as the aether work directory. The reason this behavior is unexpected is that aether_restart_dirname is set in transform_state_nml so a user would expect that aether_to_dart would actually use that namelist entry as the directory in which the restart files are stored.
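The expected behavior can be illustrated with a minimal sketch. It is written in shell for illustration only and is not the Fortran code in transform_state_mod.f90. The function name block_file_path is hypothetical. It joins the directory given by aether_restart_dirname with a bare block file name, which is the kind of full path one would expect open_block_file to receive.

```shell
#!/bin/sh
# Hypothetical helper (not part of DART): build the full path of an Aether
# block file from a restart directory and a bare file name.
block_file_path() {
    dir=$1
    name=$2
    # An empty directory means "current directory".
    [ -n "$dir" ] || dir=.
    # Strip trailing slashes so the result has exactly one separator.
    while [ "${dir%/}" != "$dir" ]; do
        dir=${dir%/}
    done
    printf '%s/%s\n' "$dir" "$name"
}
```

For example, `block_file_path "$aether_restart_dirname" grid_g0000.nc` yields a path that can be opened from any working directory, assuming the shell variable aether_restart_dirname holds the same value as the namelist entry.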

hkershaw-brown commented 7 months ago

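...
***************** RUNNING    TEST 2    ***********************
 -- Read and write restart file
**************************************************************
--------------------------------------------------------------
 Reading File : filter_input_0001.nc
--------------------------------------------------------------
 ERROR FROM:
  source : dart_time_io_mod.f90
  routine: read_model_time:
  message:  inconsistent calendar types between DART program and input file.
  message: ...  DART initialized with: NO_CALENDAR File uses: GREGORIAN
  message: ...  You may need to supply a model-specific "read_model_time()" to read the time.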

Perhaps this failure is related to commit 3c7d8d?

This is my bad. I meant to force the calendar to be Gregorian rather than having a variable calendar (implying a user choice). I'll go ahead and fix this.

kdraeder commented 7 months ago

@johnsonbk Thanks for working through the file transformation instructions. I'll work on that, unless you already started fixing the issues.

The cause of this unexpected behavior seems to be that the function open_block_file is not passed a filename including the path of <aether_restart_dirname>. The executable only works if it is copied to <aether_restart_dirname> or if <aether_restart_dirname> is the same directory as the aether work directory. The reason this behavior is unexpected is that aether_restart_dirname is set in transform_state_nml so a user would expect that aether_to_dart would actually use that namelist entry as the directory in which the restart files are stored.

johnsonbk commented 7 months ago

I haven't started fixing the transform_state bug, @kdraeder. Please proceed.

kdraeder commented 7 months ago

Second problem: aether_to_dart doesn't use the aether_restart_dirname namelist entry in open_block_file

This brings up a strategy question, which I thought about at the beginning, but lost track of. We've decided not to develop full cycling scripting at this time, so we don't have that context to define the use of aether_to_dart, filter, and dart_to_aether. The experiment structure I'm most familiar with has the following:

  1. All of the DART executables are copied to an exec_dir.
  2. All of the ensemble model output is written to a run_dir and has a full date in the filenames. (The Aether restart file names do not have the date in them, so the scripting will need to prevent overwriting of files which need to be kept.)
  3. The assimilation script runs the executables in the run_dir using full pathnames of the executables.
  4. All of the DART output is written to the run_dir.
  5. The updated state variables are written into the existing model restart files.

An advantage of this is that the run_dir does not need to be specified in the DART executables because the scripting changes to that directory before running the executables. They only need to know file names. So far I don't see any disadvantages.
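The structure above could be sketched as follows. This is a minimal illustration, not a cycling script: the helper run_in_dir and the placeholder paths are hypothetical, and the member argument to dart_to_aether is assumed to mirror the `./aether_to_dart 0` invocation shown earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: run a command from inside run_dir, so the program
# only needs to know bare file names, not the directory.
run_in_dir() {
    run_dir=$1
    shift
    # Use a subshell so the caller's working directory is unchanged.
    ( cd "$run_dir" && "$@" )
}

# Example assimilation step (placeholder paths, member 0 only):
# exec_dir=/path/to/exec_dir
# run_dir=/path/to/run_dir
# run_in_dir "$run_dir" "$exec_dir/aether_to_dart" 0
# run_in_dir "$run_dir" "$exec_dir/filter"
# run_in_dir "$run_dir" "$exec_dir/dart_to_aether" 0
```

As in the second workaround reported above, input.nml has to be present in the run_dir for the executables to find it.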

Is this the structure we want here? Or do we want the executables to be able to find the restart files wherever they are? And write their output to an output location, which may be different from the input files location?

nancycollins commented 7 months ago

the wrf folks seem to prefer putting each ensemble member's input and output in a separate subdir, probably because that makes the model advances easier to run in parallel by letting each member run in their own subdirectory without interfering with each other.

as long as it doesn't complicate the code too much, enabling flexible options for how someone wants to run both the assimilation and the model advances would be good. i know in cesm that the model advances can run in parallel using cesm options, but that may not be true for aether.

also, you probably don't want to assume that the scripting is always going to do a model advance before an assimilation, like cesm does. it's very useful for exploring parameter/namelist settings or debugging to run filter alone from a simple batch script without any other scripting requirements, as long as all the files are in the correct place with the correct names.

having said all that, it sounds like getting something running is the highest priority, so whatever works quickly might be the strategic thing here.

kdraeder commented 6 months ago

I opted to make aether_to_dart and dart_to_aether use the same directory for the Aether and filter restart files, but allow those programs to be executed from any directory. I tested and pushed those commits.

mgharamti commented 6 months ago

@kdraeder I tested your changes. I am still getting a restart file even though the argument is larger than the ensemble size. I was expecting it to fail before creating a new netcdf file. And btw, we need the same logic applied to dart_to_aether. This, however, is not a deal breaker and I'm OK to move forward with the PR in its current form.
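The check being asked for might look like the sketch below, written as a shell wrapper rather than as the Fortran fix itself. The function check_member is hypothetical, and it assumes zero-based member numbers (as `./aether_to_dart 0` suggests), so valid values run from 0 to ens_size - 1. The point is only that validation happens before any file is created.

```shell
#!/bin/sh
# Hypothetical guard (not part of DART): reject a member number that is
# not an integer in the range 0 .. ens_size-1.
check_member() {
    member=$1
    ens_size=$2
    case $member in
        ''|*[!0-9]*)
            echo "member must be a non-negative integer: '$member'" >&2
            return 1
            ;;
    esac
    if [ "$member" -ge "$ens_size" ]; then
        echo "member $member is outside an ensemble of size $ens_size" >&2
        return 1
    fi
    return 0
}
```

A wrapper could then call `check_member "$1" "$ens_size" && ./aether_to_dart "$1"`, and likewise for dart_to_aether, where ens_size is a shell variable set by the user.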

mgharamti commented 6 months ago

Great, thanks Kevin. This works in both directions now!