COSIMA / regional-mom6

Automatic generation of regional configurations of the Modular Ocean Model 6 (MOM6) in Python
https://regional-mom6.readthedocs.io/en/latest
MIT License

Setting up Regional-MOM6 on non-NCI system #79

Closed croachutas closed 4 months ago

croachutas commented 8 months ago

Hi, I'm in the process of setting up Regional-MOM6 on my workstation and working through the reanalysis forcing example as a bit of an exercise to get up to speed on it before using it as a teaching aid for one of the IMAS-OUC 2+2 programme units.

Anyway, I've thus far encountered a few issues:

  1. motu-client isn't listed as a dependency. Easy to fix at my end (just pip...) but you should probably add it to the pyproject.toml so this doesn't catch people out in future.

  2. toolpath points to MOM5 tools directory (FRE tools?) instead of tools available with MOM6 or independently: toolpath = "/home/157/ahg157/repos/mom5/src/tools/" Again, this is something I can work around (download and build MOM5... Edit: Unclear how to build FRE tools from MOM5 quickstart guide... EDIT AGAIN: Found FRE tools github page, installed apparently successfully), but I think this dependency on MOM5 needs to be clearly stated. Ideally either alternative tools available with MOM6 should be used or the necessary tools should be provided with Regional-MOM6.

  3. Running the code to download GLORYS boundary forcing and initial states successfully creates get_oceanfiles.sh, but when the code tries to run the shell script I get warnings to the effect of: /home/croach/anaconda3/envs/MOM6_tools/bin/python: No module named motuclient.__main__; 'motuclient' is a package and cannot be directly executed, and no data is downloaded. Looking at https://github.com/clstoulouse/motu-client-python/tree/master#Usage and comparing that to the contents of get_oceanfiles.sh I suspect that rm.motu_requests is written for motu-client v1.8.0 instead of motu-client v3.x. I can probably downgrade to motu-client v1.8.0 easily enough.

Anyway, I'll have a crack at installing the MOM5 tools and downgrading motu-client and keep you updated on if that fixes my current problems.

-Chris Roach

croachutas commented 8 months ago

Okay, downgrading to motu-client v1.8.0...

/home/croach/anaconda3/envs/MOM6_tools/lib/python3.11/site-packages/motuclient.py:31: DeprecationWarning: 'cgi' is deprecated and slated for removal in Python 3.13
  from cgi import log
2023-11-09 13:14:46.802 [ INFO] Asynchronous mode set
2023-11-09 13:14:46.803 [ INFO] Authenticating user croach3 for service https://my.cmems-du.eu/motu-web/Motu
2023-11-09 13:14:47.189 [ERROR] Execution failed: <urlopen error [SSL] internal error (_ssl.c:1006)>
2023-11-09 13:14:47.190 [ INFO] . reason: [SSL] internal error (_ssl.c:1006)

The warning is self-explanatory and not an issue for now but might be something to keep in mind for future revision.

Then an SSL 1006 error crops up before the download can start. Digging into that, it looks like the motuclient call tries to access a service called GLOBAL_MULTIYEAR_PHY_001_030-TDS (bolded in the code snippet below; user account and password deleted from the snippet but present in the file):

python -m motuclient --motu https://my.cmems-du.eu/motu-web/Motu **--service-id GLOBAL_MULTIYEAR_PHY_001_030-TDS** --product-id cmems_mod_glo_phy_my_0.083_P1D-m --longitude-min 149.7 --longitude-max 150.3 --latitude-min -45 --latitude-max -40 --date-min 2003-01-01 00:00:00 --date-max 2003-01-05 00:00:00 --depth-min 0.49 --depth-max 6000 --variable so --variable thetao --variable vo --variable zos --variable uo --out-dir /home/croach/MOM6/MOM6_Tas_test_case/scratch/regional_tmp/tasmania-example-reanalysis --out-name east_unprocesse
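For anyone comparing get_oceanfiles.sh against the motu-client documentation, the request above can be assembled as an argument list so that each flag is easy to inspect. This is only a sketch and not the code rm.motu_requests uses: `build_motu_command` is a hypothetical helper, the flags and values are copied from the command above, and the credentials and output options are left out.

```python
import shlex
import sys


def build_motu_command(service_id, product_id, lon, lat, dates, depths, variables):
    """Assemble a `python -m motuclient` call as an argv list (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "motuclient",
        "--motu", "https://my.cmems-du.eu/motu-web/Motu",
        "--service-id", service_id,
        "--product-id", product_id,
        "--longitude-min", str(lon[0]), "--longitude-max", str(lon[1]),
        "--latitude-min", str(lat[0]), "--latitude-max", str(lat[1]),
        "--date-min", dates[0], "--date-max", dates[1],
        "--depth-min", str(depths[0]), "--depth-max", str(depths[1]),
    ]
    # One --variable flag per requested field, as in the generated script.
    for name in variables:
        cmd += ["--variable", name]
    return cmd


cmd = build_motu_command(
    service_id="GLOBAL_MULTIYEAR_PHY_001_030-TDS",
    product_id="cmems_mod_glo_phy_my_0.083_P1D-m",
    lon=(149.7, 150.3),
    lat=(-45, -40),
    dates=("2003-01-01 00:00:00", "2003-01-05 00:00:00"),
    depths=(0.49, 6000),
    variables=["so", "thetao", "vo", "zos", "uo"],
)
# shlex.join quotes the date strings, which contain a space.
print(shlex.join(cmd))
```

Keeping the arguments in a list also means the service ID can be swapped in one place if CMEMS has renamed it.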

But manually opening https://my.cmems-du.eu/motu-web/Motu doesn't show such a service as being available:

[Screenshot at 2023-11-09 13-27-32]

Two obvious possibilities here: either CMEMS has changed their service names (hence, the code needs updating to reflect that) or I need to request access to that particular service. I'll check if it's the latter and get back to you.

-Chris

croachutas commented 8 months ago

Getting the same error/outcome when using the API request motu-client code snippet on https://data.marine.copernicus.eu/product/GLOBAL_MULTIYEAR_PHY_001_030/download?dataset=cmems_mod_glo_phy_my_0.083_P1D-m_202112, and downloading data manually seems to also not be working... So, I suspect that's a problem at CMEMS's end...

croachutas commented 8 months ago

Okay, contacted CMEMS and their advice is that access via MOTU is no longer stable; they would encourage a move to their new Copernicus Marine Client:

We apologize for the issues that MOTU has been experiencing lately. This service is no longer optimal at the present time. That's why we are currently working on a new method of data downloading: the Copernicus Marine Client.

In a nutshell, here are the main advantages of this new Client:

a simple and robust CLI tool for downloading data
an evolutive and powerful Python API for remote access
a more stable and faster subsetting service
absolutely no quota, neither on bandwidth nor data size limit

You will find in the Copernicus Marine Client section, many articles concerning its installation and use in order to download products from your command terminal (connected to your python environment) or from your notebook.

Currently, the latest version of CMC is 0.9.11 but I advise you to install the more stable 0.9.8 version.

To do so, you can directly create your environment with this version using the attached .yml file instead of the one provided in the article.

navidcy commented 8 months ago

cc @ashjbarnes

ashjbarnes commented 8 months ago

Hi @croachutas so sorry for the late reply! I'd messed up my GitHub notification settings so didn't see this until Navid tagged me.

  1. I think we had this discussion about whether to add this or not, but since it's part of an example rather than the core package I think we didn't. @angus-g @navidcy maybe this is worth revisiting to improve user experience? Although this may now be replaced with a different package to reflect the changes with CMEMS...

  2. My understanding is that FRE tools belong to the general family of models using GFDL's Flexible Modelling System (FMS) so isn't strictly a MOM5 dependency. You're right though in that I was lazy and just used a precompiled set of tools which did come from a MOM5 repo originally

  3. Thank you so much for following up on this with CMEMS! That's really good to know, and a shame. I guess this messes up our non-NCI example pretty comprehensively. We should look for an alternative or try to use the CMC client (I'll open an issue for this)

I'll make this a priority, since I agree it's really important to be able to onboard new users with an easy example, even if the downloading of data does sit a bit outside of the scope of the package. (since we expect end users to BYO whatever ocean forcing data they want once they've learned how to use the package)

If I can help further we can continue the conversation here, or have a zoom call to speed things along :)

ashjbarnes commented 8 months ago

In PR #83 I've removed references to NCI and replaced the Motu client with instructions on how to use the GUI to reproduce the same results. This will be annoying as you'll need to download locally then upload them to your HPC but will have to do for now until we implement their new data API

croachutas commented 6 months ago

Hey,

Edit: Below is based upon grabbing an updated version of the jupyter notebook but without pulling down the full declutter_notebook branch.

I've finally got back onto this. Downloaded the updated notebook, copied ERA5 from NCI to my workstation and pulled down two months of GLORYS.

So, status at the moment is things run up to step 6 (running FRE tools) where I get the following error:


TypeError                                 Traceback (most recent call last)
Cell In[18], line 1
----> 1 expt.FRE_tools((4,4))

File ~/anaconda3/envs/MOM6_tools/lib/python3.11/site-packages/regional_mom6/regional_mom6.py:1201, in experiment.FRE_tools(self, layout)
   1195 for p in self.mom_input_dir.glob("mask_table*"):
   1196     p.unlink()
   1198 print(
   1199     "MAKE SOLO MOSAIC",
   1200     subprocess.run(
-> 1201         self.toolpath
   1202         + "make_solo_mosaic/make_solo_mosaic --num_tiles 1 --dir . --mosaic_name ocean_mosaic --tile_file hgrid.nc",
   1203         shell=True,
   1204         cwd=self.mom_input_dir,
   1205     ),
   1206     sep="\n\n",
   1207 )
   1209 print(
   1210     "QUICK MOSAIC",
   1211     subprocess.run(
   (...)
   1217     sep="\n\n",
   1218 )
   1220 print(
   1221     "CHECK MASK",
   1222     subprocess.run(
   (...)
   1227     ),
   1228 )

TypeError: unsupported operand type(s) for +: 'PosixPath' and 'str'

So, what in the scheme of things looks like a fairly minor problem: combining a PosixPath with a string should use / rather than + when adding the command-line bumph to the tool path before running a subprocess.

So, question is, is that something that needs to be changed in regional_mom6.py, or in line with the older version of the notebook do I need to define the various paths as strings rather than PosixPaths? (Or is it something that's already been fixed but I'll need to update my build of regional-mom6?)

Edit: Switching to strings in place of PosixPaths worked fine.
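The failure in the traceback above can be reproduced in isolation. The sketch below shows the TypeError and one possible way to build the same command from a Path; it is an illustration rather than the fix that went into regional_mom6.py, and the `toolpath` value is a placeholder. The tool name and arguments are copied from the traceback.

```python
from pathlib import Path

toolpath = Path("/path/to/fre-tools")  # placeholder location, not a real install

# Reproduces the failure: Path does not define "+" with a str.
try:
    toolpath + "make_solo_mosaic/make_solo_mosaic"
except TypeError as err:
    print(err)

# Join path components with "/", then keep the arguments as separate list items.
executable = toolpath / "make_solo_mosaic" / "make_solo_mosaic"
command = [
    str(executable),
    "--num_tiles", "1",
    "--dir", ".",
    "--mosaic_name", "ocean_mosaic",
    "--tile_file", "hgrid.nc",
]
print(command)
# subprocess.run(command, cwd=mom_input_dir) would then run it without shell=True.
```

Passing a list avoids mixing the executable path and its arguments in one string, so it works whether the tool path was supplied as a str or a Path.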

croachutas commented 6 months ago

And with a bit of copy-pasting from the older version of the notebook I got the full setup done. Now to run a first test...

croachutas commented 6 months ago

Okay, moved over fully to the declutter_notebook branch. Forcing prep runs properly, had to edit MOM_input to specify correct number of layers (NK, was defaulting to 100, reduced it to 30) and diag_table to insert experiment name and start date.
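Hand edits like the NK change can also be scripted. The following is a minimal sketch that assumes MOM_input uses simple `NAME = value ! comment` lines; `set_mom_param` is a hypothetical helper, not part of regional-mom6, and the sample text is made up for illustration.

```python
import re


def set_mom_param(text, name, value):
    """Replace the value in a `NAME = value` line, keeping any trailing `! comment`."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(name)}[ \t]*=[ \t]*)([^!\n]*?)([ \t]*(?:!.*)?)$",
        re.MULTILINE,
    )
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value}{m.group(3)}", text
    )
    if count == 0:
        raise KeyError(f"{name} not found")
    return new_text


sample = "NK = 100   ! [nondim]\nDT = 600.0\n"  # made-up excerpt
updated = set_mom_param(sample, "NK", 30)
print(updated)
```

Raising when the parameter is missing is a deliberate choice here, so a typo in the name does not pass silently.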

Had some initial problems running via mpirun, but that looks to have been a bugger-up on my part (I was running within a conda environment, so I just deactivated conda...) and it's now resolved.

Issues with errors to do with "Invalid variable type in NetCDF file"; turns out I needed to remove the time variable from the initial condition files (done).

Now having issues finding /INPUT/forcing/tu_001.nc etc., which appears to be related to tides... Commented them out of the OBC_SEGMENT_00[1-4]_DATA file paths specified in MOM_input and set tides to false everywhere I could find a reference to tides, and now having problems with errors like "FATAL from PE 1: Values needed for OBC segment"

ashjbarnes commented 6 months ago

Ohh thanks yeah I'd accidentally left tides in from the run I was using to troubleshoot. If you set

OBC_TIDE_N_CONSTITUENTS = 0

and remove the tidal files from the obc segment files so they all look like:

OBC_SEGMENT_001_DATA = "U=file:forcing/forcing_obc_segment_001.nc(u),V=file:forcing/forcing_obc_segment_001.nc(v),SSH=file:forcing/forcing_obc_segment_001.nc(eta),TEMP=file:forcing/forcing_obc_segment_001.nc(temp),SALT=file:forcing/forcing_obc_segment_001.nc(salt)"

that will hopefully do it
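To avoid editing four near-identical lines by hand, the no-tide entries can be generated. This is a sketch that copies the format of the OBC_SEGMENT_001_DATA line above and assumes the other segments follow the same file and variable naming, which should be checked against the actual forcing files; `obc_segment_data` is a hypothetical helper.

```python
def obc_segment_data(segment, fields=None):
    """Build an OBC_SEGMENT_NNN_DATA line with no tidal files."""
    if fields is None:
        # MOM6 field name -> variable name inside the forcing file (from the line above).
        fields = {"U": "u", "V": "v", "SSH": "eta", "TEMP": "temp", "SALT": "salt"}
    filename = f"forcing/forcing_obc_segment_{segment:03d}.nc"
    entries = ",".join(f"{key}=file:{filename}({var})" for key, var in fields.items())
    return f'OBC_SEGMENT_{segment:03d}_DATA = "{entries}"'


lines = ["OBC_TIDE_N_CONSTITUENTS = 0"] + [obc_segment_data(n) for n in range(1, 5)]
print("\n".join(lines))
```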

ashjbarnes commented 6 months ago

For reference here's a MOM_input file for a no tide run

croachutas commented 6 months ago

Thanks, that gets me past that issue but now I'm getting errors about "MOM_diag_remap, initialize_regridding: Specified file not found: Looking for 'INPUT/diag_rho2.nc' (FILE:diag_rho2.nc,interfaces=rho2)"

It looks like either diag_rho2.nc isn't being generated when the forcing fields are created or needs to be provided "pre-baked".

croachutas commented 6 months ago

Switched to using DIAG_COORD_DEF_RHO2 = "WOA09" in MOM_input; might not be ideal but moves me onward for now. Now getting an error about "Unable to find variable DAYMAX in any input files."

Edit: Adding DAYMAX following other MOM6 examples sees the model start running but immediately crash because it looks for forcing from outside the time period provided (defaults to a date of 0001-01-01 vs a date range on the forcing data of 2013-01-01 to 2013-01-10). Comments in MOM_input in the examples suggest it should be possible to set the end date via "ocean_solo_nml in input.nml" but I can't seem to find examples that actually do this.

Edit 2: As far as I understand this should be read from &coupler_nml in input.nml but isn't for some reason.

Edit 3: I built MOM6 following the instructions at https://github.com/mom-ocean/MOM6/blob/main/ac/README.md which seem to be for an ocean only installation... Looks like the setup shown here is instead built with coupling but then uses the coupling code to just feed in surface properties. Will try rebuilding MOM6 following the instructions for allowing coupling at https://github.com/NOAA-GFDL/MOM6-examples/wiki/Getting-started#compiling-the-models

ashjbarnes commented 6 months ago

Oh yeah this would be the problem! Ocean only has no atmospheric forcing and so there's no coupler. The input.nml is then configured differently for ocean only. I imagine there's a way to modify both the MOM_input and input.nml files to get the formatting right for ocean only if you really don't want atmospheric forcing.

To be able to use the provided input configurations from the demos you'll need an ocean-sis executable. You can use my executable that's ready to go on gadi for testing purposes if you like. It's found at:

/g/data/v45/ab8992/mom_executables/dec-23-build

ashjbarnes commented 6 months ago

> Thanks, that gets me past that issue but now I'm getting errors about "MOM_diag_remap, initialize_regridding: Specified file not found: Looking for 'INPUT/diag_rho2.nc' (FILE:diag_rho2.nc,interfaces=rho2)"
>
> It looks like either diag_rho2.nc isn't being generated when the forcing fields are created or needs to be provided "pre-baked".

Thanks good feedback - this probably shouldn't be included actually as it's a diagnostic I've been using for my research. I'll remove it from the demos

croachutas commented 6 months ago

Thanks Ash, I built a couple of versions following the instructions above. Well, the model starts running but partway through setup I get `FATAL from PE 4: compute_qs: saturation vapor pressure table overflow, nbad= 210

Bad temperatures (dimension 1): 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1`

Which, per chasing up other people's error reports, suggests something is wonky with atmospheric temperature (temp outside the range of the lookup tables used to compute saturation vapour pressure). Checked the forcing file and it gave 2 m temp in degK; thought that might be it, so manually converted the file to degC and still getting the same error.
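A quick way to rule units in or out is to look at the magnitude of the values. The sketch below is only a heuristic on plain Python lists, with made-up sample numbers; the 150 threshold is an assumption, and which unit the model expects still has to be checked against the data_table and coupler configuration. It does not address the overflow itself.

```python
def guess_temperature_units(values):
    """Guess whether near-surface air temperatures are in kelvin or Celsius (heuristic only)."""
    mean = sum(values) / len(values)
    # Near-surface air temperatures are roughly 180 K to 330 K, which is about
    # -93 degC to 57 degC, so a mean above 150 is taken to be kelvin.
    return "K" if mean > 150.0 else "degC"


def to_kelvin(values):
    """Return the values in kelvin, converting only if they look like Celsius."""
    if guess_temperature_units(values) == "K":
        return list(values)
    return [v + 273.15 for v in values]


t2m = [284.6, 285.1, 283.9]  # made-up sample values, not from the forcing file
print(guess_temperature_units(t2m))
print(to_kelvin([11.45]))
```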

Any ideas?

ashjbarnes commented 6 months ago

My apologies Chris. I'll prioritise this and hopefully figure things out this week! I'm now hitting these kinds of errors with my own research too. A couple of things have changed since things were working smoothly with ERA5 in December, notably the mom executable had been out of date so there are possibly some compatibility issues if they patched anything to do with surface forcing.

Perhaps in the meantime you could either:

I'll let you know as soon as it's fixed

ashjbarnes commented 6 months ago

Ok Mystery solved.

To use ERA5, you need the latest version of FMS as they've updated the surface forcing functionality. Unfortunately, the current MOM_examples uses an FMS version from 3 years ago. My runs were working with ERA5 using an executable that @angus-g had sent me a while ago, but I'd recently recompiled mom6 to include their latest boundary condition improvements, and my FMS version reverted back!

I've since fixed the bugs you found on the declutter notebook branch, and I can confirm that with an updated FMS executable the reanalysis_forced notebook does work, and for the default FMS version you get from MOM_examples I get the same mom6 errors as you

croachutas commented 6 months ago

Thanks. I'll try pulling down and building the latest version of FMS.

ashjbarnes commented 6 months ago

Angus has a branch of his ninja compile scripts ready to go for the latest FMS and code.

On Gadi I've used it already to create a new executable you could use to test. It's found here: /g/data/v45/ab8992/mom_executables/jan2024-latest-everything

croachutas commented 6 months ago

Thanks again. Downloaded the latest version of FMS and will try building soon. Looking at install instructions and looks like it'll mostly be a matter of changing some paths for some links (hopefully not resulting in things exploding).

croachutas commented 6 months ago

Looks like I've got it working (well, running and not immediately exploding at least). Turns out MOM6_examples provides two versions of FMS (in src/FMS1, 2019ish, and src/FMS2, 2022ish, respectively). By default it builds from src/FMS1... which doesn't work with the coupled model. But changing that to src/FMS2 and wham it's suddenly good.

Doing a longer run now to make sure the output isn't doing anything too weird...

(I did try building with a new download from the FMS repo but hit errors when compiling the ice-ocean model).

ashjbarnes commented 6 months ago

Interesting! We found we also needed to update the FMScoupler to get it to work. MOM_examples had an older version of this that didn't support surface fluxes at different heights as required by ERA5 forcing

ashjbarnes commented 5 months ago

@croachutas There were issues with the era5 forcing setup: I'd not included the solar fluxes! If you pull down the latest code now (it's all merged into main) then the setup_era5 function now handles the surface fluxes, and the data_table in the era5 premade directory will pass these forcing files onto the coupler

croachutas commented 5 months ago

Thanks! Will test later today.

croachutas commented 5 months ago

Okay... Usual minor issues (lack of start date and experiment name in diag_table; need to turn off tides) which can be fixed with a few seconds in the relevant config files. The link to the input directory, INPUT, is not created (just the link named inputdir, which always breaks), but that's another easy fix.

And the bigger issue, I'm getting a crash with an error "FATAL from PE 1: fms_io_mod(field_size): file INPUT/forcing/tu_001.nc and corresponding distributed file are not found" Those files seem to be tidal forcing at boundaries... Should be able to remove them from MOM_input without any problems...

croachutas commented 5 months ago

Then missing diag_rho2.nc, in the past replaced that with WOA09 keyword in MOM_input... Done. Would suggest either providing this file for the examples or changing this in the default MOM_input.

Now, error because it wants the various forcing files to not have the .nc suffix... Another easy fix at my end, either need to rename the files or change entry in data_table file. Again, something that probably should be changed for the default setup.
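If renaming the files is the chosen route, it is worth listing the planned renames before touching anything. This is a sketch with hypothetical helper names and made-up file names; whether the data_table expects the .nc suffix depends on the version of the demos in use, so check that first.

```python
import tempfile
from pathlib import Path


def strip_nc(name):
    """Return the name without a trailing .nc; other names are returned unchanged."""
    return name[:-3] if name.endswith(".nc") else name


def plan_renames(directory):
    """List (old, new) paths for *.nc files in directory; nothing is renamed here."""
    return [(p, p.with_name(strip_nc(p.name))) for p in sorted(Path(directory).glob("*.nc"))]


# Demo in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("era5_t2m.nc", "notes.txt"):
        (Path(tmp) / name).touch()
    planned = [(old.name, new.name) for old, new in plan_renames(tmp)]
print(planned)

# To apply: for old, new in plan_renames("forcing"): old.rename(new)
```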

croachutas commented 5 months ago

And running now... Three days done and yet to explode, but does seem slower than before you added the surface fluxes (not stupidly slow but may mean I tell the students to do a 5 day run rather than a 10 day run)

ashjbarnes commented 5 months ago

Hmm are you sure you're on the latest version?

The diag table and tide issues were fixed. At least if you look at the MOM_input file in the premade run directory the diag table and tides have been removed. The data table in the current version also expects the .nc to be at the end of surface forcing as well

croachutas commented 5 months ago

Okay, I see the problem: when installing regional-mom6 to a conda environment I used pip install git+https://github.com/COSIMA/regional-mom6.git... which set up the package in the wrong place, forcing me to manually create the demos folder and subfolders, which did not update when I updated the package...

Will uninstall and try rebuild using conda-build.
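When it is unclear which copy of a package the environment is picking up, the import location and installed version can be checked directly. A minimal sketch: `describe_install` is a hypothetical helper, the module name `regional_mom6` comes from the traceback earlier in this thread, and the distribution name `regional-mom6` is an assumption.

```python
import importlib.util
from importlib import metadata


def describe_install(module_name, dist_name=None):
    """Report where a module would be imported from and the installed distribution version."""
    spec = importlib.util.find_spec(module_name)
    origin = spec.origin if spec is not None else None
    try:
        version = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"origin": origin, "version": version}


# Prints None for both fields if the package is not installed in this environment.
print(describe_install("regional_mom6", "regional-mom6"))
```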

ashjbarnes commented 5 months ago

ok good to know... @angus-g might be able to help. Perhaps the original build instructions need updating with the new demos?

Oh and regarding the creation of the INPUT folder, that's carrying over from when people use payu, which creates a temporary work directory. In there there's an INPUT folder with symlinks to every forcing file.

It's a good point that we should handle this for non-payu users. Perhaps using the inputdir provided by the user in the notebooks as you say
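For non-payu users, the INPUT link can be created with a few lines of Python. This is a sketch of one possible approach, not what the package does: `link_input_dir` is a hypothetical helper and the directory names in the demo are made up.

```python
import tempfile
from pathlib import Path


def link_input_dir(run_dir, input_dir, link_name="INPUT"):
    """Create run_dir/INPUT as a symlink to input_dir, replacing a stale link if present."""
    link = Path(run_dir) / link_name
    if link.is_symlink():
        link.unlink()
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.symlink_to(Path(input_dir).resolve(), target_is_directory=True)
    return link


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    run_dir, input_dir = Path(tmp) / "rundir", Path(tmp) / "inputdir"
    run_dir.mkdir()
    input_dir.mkdir()
    (input_dir / "hgrid.nc").touch()
    link = link_input_dir(run_dir, input_dir)
    visible = sorted(p.name for p in link.iterdir())
print(visible)
```

Refusing to overwrite a real directory named INPUT is deliberate, so existing forcing files are never removed by accident.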

croachutas commented 5 months ago

Well, tis working. But using a southern Tasmania test case I'm getting some weird cooling in the coastal margins. Day 50 from a Jan-Feb 2013 run below as an example:

[image: MOM6_test_run_PT]

Some of this could be due to narrow channels being missed by the current 2 km test case (D'Entrecasteaux Channel and Frederick Henry Bay/Norfolk Bay round the Tasman Peninsula both get cut off from the ocean). Other areas don't seem to have an obvious explanation... Could it just be running into the limits of surface forcing resolution?

(In some ways this isn't necessarily a problem that needs fixing... Since this setup is intended as a teaching tool asking the students to comment on what they think the model got wrong could be very useful).

Will also run a test or two in the open ocean (Southern Ocean Flux Station and surrounds?) and on some less complicated coastline...

ashjbarnes commented 5 months ago

Check your surface fluxes. There was an older version that accidentally set the solar radiation to zero (misunderstanding of how data override works). The data table should look like this for the radiative fluxes:

[image]

croachutas commented 5 months ago

Thanks Ash, that seems to have mostly solved the cooling on the seafloor but doesn't look to have helped the cooling of coastal SST as much.

ashjbarnes commented 5 months ago

Yeah that's really strange! I haven't noticed it in my runs which use ERA5, SIS/MOM6 at latest source and Glorys at the boundary. Is your executable fairly up to date? There were some updates to the coupler that helped ERA5 perform more sensibly

croachutas commented 5 months ago

Nope, I'm still using mostly the defaults from the mom6-examples build... Looks like that points to a rather old version of the coupler (coupler @ 14578f0, circa 6 years old). I'm afraid if I try upgrading the coupler it'll break the build scripts as happened when I tried installing more recent versions of FMS.

croachutas commented 5 months ago

(Might need to see if I can get the build using ninja to work... Any idea if that codebase works outside NCI?)

angus-g commented 5 months ago

> Any idea if that codebase works outside NCI?

There's nothing really NCI-specific in it, but you might have to tweak paths. As long as you have working Fortran and C compilers, an MPI implementation and the NetCDF libraries (for C and Fortran), you should be able to get it working.
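Before attempting the build on a workstation, it can help to confirm the prerequisites listed above are on PATH. The sketch below is a rough check only; the executable names are assumptions (a site may use different compilers or MPI launchers), and `find_build_tools` is a hypothetical helper.

```python
import shutil


def find_build_tools(names=("gfortran", "gcc", "mpirun", "nc-config", "nf-config", "ninja")):
    """Map each executable name to its path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


for name, path in find_build_tools().items():
    print(f"{name:10s} {path or 'NOT FOUND'}")
```

nc-config and nf-config ship with the NetCDF C and Fortran libraries respectively, so finding them is a reasonable hint that both are installed.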

navidcy commented 5 months ago

Question: should we convert this into a Discussion? If it's not related to a bug/issue in the code I think it fits better in the Discussions section.