C grid gx3 tests: Kimmritz et al. 2016 C1 configuration

JFLemieux73 commented 2 years ago

I have bad news...I just ran gx3 for one year with B grid EVP, B grid VP and C grid EVP.

I used this commit: bffe4e8797f4d166d179df8df8ccc44a42295b13

ndte=1200 visc_method = 'avg_zeta' elasticDamp = 0.12d0 deltaminEVP = 2e-9 deltaminVP = 2e-9 capping = 0.

Here is the thickness diff (Bvp-Bevp):

Bvp_B

They are very close (range: -0.1 to 0.1). Here is now Cevp-Bevp. Notice that the range is now -1 to 1 m.

C_B

That's not cool!

JFLemieux73 commented 2 years ago

I don't think it is related to elastic waves. Is it possible the issue is related to advection or ridging (maybe the deformation subroutine has a bug?).

I am going to use boxwallblockp5 for debugging.

JFLemieux73 commented 2 years ago

Here are a few things I would like to test/investigate: -thermo on or off -test new deformation_T2 subroutine -test kstrength = 0 or 1 -test remap and upwind -issue with metric terms?

dabail10 commented 2 years ago

I think that adding the viscosities to the history might not be worth it. The B-grid values are just local to the subroutine stress. I think with the gbox12 or even gbox80, you should just print these. I can try a test where I print them for a B and C grid. I did this awhile back and I remember that the issue was the values of these where you had a a jump from ice to no ice. This was improved when using the avg_strength method. Keep in mind for the B-grid, you are never averaging the viscous coefficients and you only use the strength at the T-points.

JFLemieux73 commented 2 years ago

Thanks Dave. For the moment I don't need to look at viscosities. After my tests with boxwallp5 I don,t think there is an issue with viscosites. We will see...

JFLemieux73 commented 2 years ago

I ran only one month of gx3 and I see the same problem. I look at monthly means below (January 2005). The max diff hi (B vp) - hi (B evp) is only around 1 cm. Here is the difference between hi (C evp) and hi (B evp):

Cevp_Bevp_hi_2005-01

eclare108213 commented 2 years ago

I suggest trying a run with thermo and advection and ridging all off, to compare divu. If that's okay, then turn ridging on and compare again.

JFLemieux73 commented 2 years ago

There are large differences. In the first test I just change deform_option to 2 (new method in my deformT branch). Good idea Elizabeth. The new deform method does not solve the problem. The difference between option 1 and 2 is around 1 cm but the diff between C evp (option 2) and B evp is similar to the fig above.

JFLemieux73 commented 2 years ago

Hello, Questions for @eclare108213 @apcraig @dabail10 : -has anyone looked at the new grid lengths (e.g. dxE) for the gx3 test case. Do they make sense? -Am I right that the B grid starts from a complete restart but that the C grid does not? I don't think it would explain the big differences in thickness but it could help me for debugging.

JFLemieux73 commented 2 years ago

Follow up question. Let's say I want to have the same initial conditions. What would be the best way to set the velocities and stresses to zero at the beginning? In dyn _prep2 but only done for the first time level?

eclare108213 commented 2 years ago

Good morning @JFLemieux73. Can you post a figure that shows dxE etc that look odd to you? I haven't looked at those yet.

You can set the B- and C -rid test cases with gx3 to have the same initial conditions using ice_in -- you shouldn't need to go into the code itself to do that. There is probably also a set_nml option to do that -- I'll have to look to see what they are. gx3 does usually start from a B-grid restart file, but you can use no-ice or default (slab) conditions instead for both grids.

And those latter options start from rest.

JFLemieux73 commented 2 years ago

Thanks Elizabeth. I have not looked at dxE or other grid lengths. As our tests are good on uniform grids I am wondering if the weird results on the gx3 grid can be due to these terms.

dabail10 commented 2 years ago

Both the B and C grids start from the same gx3 restart. The Cgrid velocities and stresses are initialized to zero though. This would account from some initial differences in the first couple of days.

Dave

JFLemieux73 commented 2 years ago

Thanks Dave. I have another question. As Elizabeth suggested, I tried to run without advection, ridging and thermo. I set ktherm=kridge=ktransport = -1

The problem is that uvelE=vvelN=0 during all the simulation....basically nothing happens. I guess tauair is not calculated. Any idea what I could do?

apcraig commented 2 years ago

I've been following progress. A few thoughts. Do all the problems in the gx3 grid start next to the land boundary? It looks like the box cases are not showing the same issues. The box cases (except the kmtislands) have very limited land geometry. For instance, the land is all constant in i or j or are a corner. The gx3 will have the opposite of a corner (a "peninsula") which is not in the box cases.

We are masking the computation of uvelN, vvelE, uvel, vvel after we grid_average_X2Y. Is this working properly? We should check in the gx3 case whether the B and C uvel and vvel fields have values of zero at exactly the same locations. I believe they should.

We impose the BC in the C grid solver via an implementation that imposes a mirror image inside the boundary. Is this not working correctly in non-uniform grids? We should check. Also, do we need to leverage that information somewhere else where we're not. I would say we should leverage it in the uvel, vvel calculation, but I think the masking is doing it via a different method.

Finally, the grid parameters can be written out to the history file with f_dxe, f_dye, f_dxn, f_dyn, f_nmask, f_emask, etc. We should probably add that to the default namelist file.

dabail10 commented 2 years ago

That makes sense about the wind stresses and the fluxes. The subroutine icepack_atm_boundary which is called within icepack_step_therm1. This is called from step_therm1 in CICE. I believe step_therm1 is not called when ktherm = -1. So, we need to bring the call to icepack_atm_boundary up or move the if condition on ktherm into icepack.

eclare108213 commented 2 years ago

I don't recommend refactoring the logic in the code for this test. It's easier to just set a non-physical wind stress as a default (e.g. constant, u=v diagonal), since this isn't intended to be a realistic run anyhow - you're just testing the velocity and stress calculations with complicated boundaries. And that's what the kmtislands land mask was intended to do...

JFLemieux73 commented 2 years ago

I added some more diagnostics for debugging. This is the Arctic volume for the first month (ktherm=2, ktransport=kridge=1):

volume_C-grid_B-grid

And the Arctic RMS ice speed:

RMS_speed_C-grid_B-grid

So it looks like it is a bit faster with the C-grid. I guess this leads to more deformations and therefore more growth.

JFLemieux73 commented 2 years ago

I have replaced the initial condition to ice_ic = 'default'. The same things happen: thicker and slightly faster for the C grid. Here is the volume for one month:

volume_C-grid_B-grid_slab

JFLemieux73 commented 2 years ago

Here is the slab experiment repeated but I used kstrength=0 (Hibler) with P*=0.0 (no rheology). Here is the volume for both B and C grids:

volume_C-grid_B-gridslab_P0

eclare108213 commented 2 years ago

I remember reading a paper a while back, maybe by Martin Losch??? which characterized differences in (E)VP discretization in terms of an apparent ice strength. Do you remember that? Maybe you were an author, @JFLemieux73! I don't remember whether the comparison was B vs C for EVP, or if it was VP vs EVP.... If we rule out other potential causes like varying grid lengths, then perhaps that is the reason. But we need to rule out other things first.

JFLemieux73 commented 2 years ago

Do you mean: On solving the momentum equations of dynamic sea ice models with implicit solvers and the elastic–viscous–plastic technique?

This is Losch and Danilov 2012.

eclare108213 commented 2 years ago

Sounds right!

JFLemieux73 commented 2 years ago

Here is the same exp as above but with Pstar=2.75e4 (default).

volume_C-grid_B-gridslab_Pdef

JFLemieux73 commented 2 years ago

Hum...this is interesting. This is slab, Pstar=2.75e4 but with upwind advection:

volume_C-grid_B-gridslab_Pdef_upwind

dabail10 commented 2 years ago

This would seem to indicate a problem with the interpolation of the C-grid velocities to the B-grid.

JFLemieux73 commented 2 years ago

I will use the gx3 restart and run for one year with upwind just to make sure.

apcraig commented 2 years ago

I am running a longer B and C case now. I have confirmed that the uvel and vvel zero values on the B and C grid are at identical locations. So that's good. I have also looked at the grid values on the T, U, N, and E and they look fine too. U and T are not particularly consistent (see https://github.com/CICE-Consortium/CICE/issues/653), but otherwise, E and N are consistent with U and T.

Is the upwind scheme not using the corner velocities? If the corner velocities on the C grid are not correct, why would the upwind work but not the remap? Maybe we're losing accuracy in the velocity gradients or something? Does that play a role in the remap routine but not upwind?

JFLemieux73 commented 2 years ago

I guess the upwind uses directly uvelE and vvelN for the Cgrid. uvel and vvel need to be interpolated for the Bgrid to use upwind.

JFLemieux73 commented 2 years ago

The total volumes (B and C) with upwind are very similar (last graph above). The hi fields have differences though. Here is for the B grid:

Bslab_Pdef_upwind_hi_01

and C grid:

Cslab_Pdef_upwind_hi_01

JFLemieux73 commented 2 years ago

And here is the difference: Cgrid - Bgrid

Cslab_Pdef_upwind_Bslab_Pdef_upwind_hi_2005-01

JFLemieux73 commented 2 years ago

And it still seems to be faster with the C grid:

RMS_speed_C-grid_B-gridslab_Pdef_upwind

apcraig commented 2 years ago

I attach 4 images each for hi, uvel, and vvel. These are 1 month average after the start of the long run. I am using the gx3 defaults and manually setting

>     ndte            = 1200
>     kstrength       = 0
>     visc_method     = 'avg_zeta'
>     elasticDamp     = 0.12d0
>     deltaminEVP     = 2e-9
>     capping         = 0.
>     maxits_nonlin   = 30

The 4 images are the B, C, absolute diff, and relative diff or each of the three fields. My thoughts

The results are consistent with @JFLemieux73's above for 1 month
In some places where the hi absolute diff is larger, the relative diff is smallish, these might internal variability?
Where the hi absolute and relative diffs are large, they seem to on the edge of land where the velocities are small. A small absolute change in the velocity there seems to be making a relatively big difference in the hi. This would be expected I think. I think the question is whether those small absolute differences (large relative differences) of velocity are OK or not.

I'm not saying everything is OK and am interested in the fundamental differences in upwind and remap, what is one working better than the other.

hi:

hi_B hi_C hi_adif hi_rdif

uvel:

uvel_B uvel_C uvel_adif uvel_rdif

vvel:

vvel_B vvel_C vvel_adif vvel_rdif

apcraig commented 2 years ago

The remap uses corner velocity in B and C? The upwind uses corner in B and E/N in C?

Does the upwind average the B corners to E/N?

JFLemieux73 commented 2 years ago

The remap was implemented for the B grid. C grid velocities are interpolated to corners for remap.

Upwind is naturally suited for C grid. C grid velocities are directly used while B grid velocities are interpolated to E and N points. You can see that in subroutine transport_upwind.

dabail10 commented 2 years ago

Right, but there is interpolation in either case. B grid is interpolated to the C points for upwind and C grid is interpolated to the B grid for the remapping.

apcraig commented 2 years ago

I ran three 5 year tests. Bvp, Bevp, Cevp. The Cevp case hangs after 1.1 years, so that's a problem. Not sure what's going on. Below I plot NH and SH ice volume, ice area, and max volume (max thickness?) for the 3 cases as well as the deltas relative to Bvp. The Bvp and Bevp are very similar, it's really quite surprising (to me). The Cevp is not as close, but it's also not different climate. I wish we could run it more than a year. I'll try to look into that. The max volume is the thing that's the most concerning, although it's not running totally out of control.

Volume: volumeNH dvolumeNH volumeSH dvolumeSH

Max Volume:

maxvolNH dmaxvolNH maxvolSH dmaxvolSH

Area:

areaNH dareaNH areaSH dareaSH

Speed:

speedNH dspeedNH speedSH dspeedSH

JFLemieux73 commented 2 years ago

Like Tony I am more and more thinking that everything is ok. I ran gx3 (ice_ic='default') for B and C grid evp. Same settings as above (capping=0, avg_zeta, etc.). Here is uvel for the B grid:

Bslab_gx3_uvel_1_01-43200

I also show uvel (not uvelE) for the C grid (same thing as for the B grid):

Cslab_gx3_uvel_1_01-43200

JFLemieux73 commented 2 years ago

The vvel fields are also very similar. Here is divu (collocated for B and C).

B grid: Bslab_gx3_divu_1_01-43200

C grid: Cslab_gx3_divu_1_01-43200

JFLemieux73 commented 2 years ago

For some reasons we see values of divu in weird places (e.g. Africa, Mediterranean Sea) for both B and C grids.

Here are also some fields related to rheology. Note that they are not collocated.

B grid sigP:

Bslab_gx3_sigP_1_01-43200

C grid sigP:

Cslab_gx3_sigP_1_01-43200

B grid strintx:

Bslab_gx3_strintx_1_01-43200

C grid strinxE:

Cslab_gx3_strintxE_1_01-43200

JFLemieux73 commented 2 years ago

@apcraig do you know what caused the C grid run to stop? Did it crash due to a CFL problem for example?

JFLemieux73 commented 2 years ago

@apcraig you mentioned above that in our simple test cases we don't test things like a peninsula. I think it might be a good idea to test that. We could have a box with a peninsula in the middle of the eastern wall. We could initialize half (eastern part) with for example 0.5 (or higher) concentration and use constant winds (maybe northeast).

apcraig commented 2 years ago

I'm still debugging the C grid issue. It just hangs without any other information. I have a case with the debug flags turned on that I'm running now. Hopefully, that will run fast enough, but I may have to try other things.

With regard to peninsula's. The kmt islands case covers all those land mask cases. I think we should be able to leverage that configuration for testing.

eclare108213 commented 2 years ago

Feel free to simplify the land mask in kmtislands. I put everything in there that I imagined might be pertinent, and it's probably overkill.

apcraig commented 2 years ago

I can run the C grid case one year at a time, just not 5 years. That suggests it's a memory leak or something like it. I will add a memory diagnostics and see whether that turns something up.

JFLemieux73 commented 2 years ago

I was curious and also ran it myself. I ran it for 2 years (not one year at a time) and it worked.

Can a memory leak partly explain the slowness of the C grid code?

apcraig commented 2 years ago

Memory leaks could slow the code down, but probably not much. Not surprising that the error is hit (or missed) differently on different machine. Just confirms a technical issue more than a science issue. Working on it now.

apcraig commented 2 years ago

I have updated the time series plots above, https://github.com/CICE-Consortium/CICE/issues/702#issuecomment-1089342997. Cevp has now run 5 years. Results look pretty reasonable. SH looks quite fine. NH pretty good. I think the biggest issue is the max volume (thickness?) which is much higher in the Cevp (12m vs 6m). I assume this is one point, but will make some 2d plots. If we want other plots, let me know.

I have also run several tests to try to understand why the Cevp case hangs after 1.1 years.

An identical case with debug flags hung at the same spot with no additional information.
The model hangs quite consistent on Jan 18 of the second year.
A gx3 Cgrid case with default namelist runs 2 years fine. The cases we're testing here have ndte=1200 and other things changed.
The model runs fine 1 year at a time.
I have instrumented CICE with memory diagnostics (which I will PR) and there is nothing unusual happening with those.
OpenMP is off in all these tests

It's unclear what's happening at this point. It's unusual for the model to just hang that way, especially with debug flags on. It's also interesting that @JFLemieux73 can run fine for 2 years. I think it will take a bit of a debugging effort to get to the bottom of it. I guess I would advocate for creating an issue and not holding things up at this point. This will require "long runs" to debug. The next steps may be to

try to run the default gx3 cgrid 5 years and see if it hangs at some point
test various namelist changes to see if we can identify one that might be causing problems.
??

apcraig commented 2 years ago

Here are 2d plots, monthly average, Dec 2009, 5 years into the runs. Surprisingly close. Either the B and C implementations produce really similar solutions or the forcing so constrains the solutions that they have to produce this result (i.e. very little internal model variability). Plots are for Bevp, Cevp, and the difference for 4 fields, aice, hi, uvel, vvel.

aice:

aice_B aice_C aice_diff

hi:

hi_B hi_C hi_diff

uvel:

uvel_B uvel_C uvel_diff

vvel:

vvel_B vvel_C vvel_diff

apcraig commented 2 years ago

The main problem seems to be the grid point that has 12m ice in the Cevp run. That points is at i=54, j=113. It doesn't seem to be increasing further, so maybe it's OK? 2 of the 4 neighbor points are land....

CICE-Consortium / CICE

C grid gx3 tests: Kimmritz et al. 2016 C1 configuration #702

hi:

uvel:

vvel: