COSIMA / access-om2

Deprecated ACCESS-OM2 global ocean - sea ice coupled model code and configurations.

document impact of payu ice restart bug #123

Closed aekiss closed 4 years ago

aekiss commented 5 years ago

Copying from slack:

Payu had a bug in handling ice restarts which has since been fixed: https://github.com/marshallward/payu/issues/138

We should document its impact on existing runs:

aidanheerdegen commented 5 years ago

A single timestep? Minimal I'd have thought. In other words, a worthy goal, but not sure there are resources to look into it, unless it is critical to the paper.

aekiss commented 5 years ago

It's not just a single timestep. It affects all of the first JRA55 or CORE forcing period (3 hours or 6 hours, respectively)

aekiss commented 5 years ago
aidanheerdegen commented 5 years ago

Ok, these are steps identified in the TWG meeting to remedy this issue

  1. Confirm payu/0.11.2 working correctly
  2. Set as default version
  3. Determine which payu versions affected
  4. Turn off affected modules in the modulefile and issue a message explaining the bug, which module to load instead, and to email climate_help if users still have issues
  5. When users complain, assess individual cases
  6. If necessary move payu module to non-app path
  7. Delete old versions?

aidanheerdegen commented 5 years ago

Can we get as many people as possible using payu/0.11.2? Ping @AndyHoggANU @navidcy @angus-g

I'll tell Ruth

navidcy commented 5 years ago

I was using it but there were bugs, so I switched back to v0.10. Shall I go back to the latest?

On Dec 12, 2018, at 00:50, Aidan Heerdegen notifications@github.com wrote:

Can we get as many people as possible using payu/0.11.2? Ping @AndyHoggANU @navidcy @angus-g

I'll tell Taimoor and Ruth


aidanheerdegen commented 5 years ago

I was using it but there were bugs, so I switched back to v0.10. Shall I go back to the latest?

Yes please. We tracked down the issue with 0.11.2 and it should work now. You aren't affected by the bug we're talking about, but we may need to deactivate old versions, so would like to make sure other users aren't affected.

AndyHoggANU commented 5 years ago

Yes, I will switch over ... but can we ensure that the config.yaml in Nic's default config uses syntax for 0.11.2? I had trouble with the syntax change on 0.10 and still haven't quite figured it out.

aekiss commented 5 years ago

The spreadsheets in /g/data3/hh5/tmp/cosima/access-om2-run-summaries tabulate the payu version used for each run, so we can determine exactly what was affected.

For the record, this bug affected the entire 0.1 deg IAF run 01deg_jra55v13_iaf in the model announcement paper and runs 414 onwards in the RYF spinup 01deg_jra55v13_ryf8485_spinup6 that formed its initial condition. The 1 deg and 0.25 deg runs in the paper were not affected.

aekiss commented 5 years ago

Buggy pre-0.11.2 versions of payu are still available:

module avail payu

----------------------------------------------------------------- /projects/v45/modules -----------------------------------------------------------------
payu/aek  payu/dev  payu/test

--------------------------------------------------------------- /apps/Modules/modulefiles ---------------------------------------------------------------
payu/0.10   payu/0.11.1 payu/0.11.2 payu/0.7    payu/0.8    payu/0.8.1  payu/0.9    payu/0.9.1  payu/0.9.2

Is there any reason not to disable these (see point 4 above)?
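For reference, a minimal sketch of how the listing above could be split into affected and unaffected versions. It assumes, as stated in this comment, that every release before 0.11.2 counts as buggy; the thread has not confirmed that for the oldest versions. The function names `parse_version` and `affected_modules` are illustrative only and are not part of payu.

```python
# Version strings copied from the /apps/Modules/modulefiles listing above.
LISTING = (
    "payu/0.10   payu/0.11.1 payu/0.11.2 payu/0.7    payu/0.8    "
    "payu/0.8.1  payu/0.9    payu/0.9.1  payu/0.9.2"
)

# Assumption: 0.11.2 is the first release with the ice restart fix,
# so anything older is treated as affected.
FIXED = (0, 11, 2)


def parse_version(module_name):
    """Turn 'payu/0.8.1' into a comparable tuple such as (0, 8, 1)."""
    version = module_name.split("/", 1)[1]
    parts = tuple(int(p) for p in version.split("."))
    # Pad to three components so that 0.10 compares as 0.10.0.
    return parts + (0,) * (3 - len(parts))


def affected_modules(listing, fixed=FIXED):
    """Return module names older than `fixed`, sorted by version."""
    modules = listing.split()
    old = [m for m in modules if parse_version(m) < fixed]
    return sorted(old, key=parse_version)


if __name__ == "__main__":
    for name in affected_modules(LISTING):
        print(name)
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would sort 0.10 before 0.9.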

aekiss commented 5 years ago

Buggy pre-0.11.2 versions of payu are still available - is there any reason not to disable these as discussed?

aidanheerdegen commented 5 years ago

I have deactivated all the payu modules in projects/v45, but I don't have permission to do anything with the system modules. I doubt @marshallward does either now.

marshallward commented 5 years ago

I still have /apps permission, will be available in about an hour

On Mon, Aug 5, 2019 at 9:16 PM Aidan Heerdegen notifications@github.com wrote:

I have deactivated all the payu modules in projects/v45, but I don't have permission to do anything with the system modules. I doubt @marshallward does either now.


navidcy commented 5 years ago

Wow. They cancelled Marshall’s door card at 4:59 on the last day he was employed but they still allow him apps permissions. :)

On 6 Aug 2019, at 11:19, Marshall Ward notifications@github.com wrote:

I still have /apps permission, will be available in about an hour


marshallward commented 5 years ago

Questionable access permissions aside... I've done a 1.0 tag and installed it on Raijin. It's not yet enabled as default, which is still set to 0.11.2.

I've also disabled 0.9-0.11.1 at the request of @aekiss, since those versions lack YATM support and it sounds like there were some other dangerous bugs.

We can make 1.0 the default once people are comfortable using it. I guess some announcement should be made somehow.

aekiss commented 5 years ago

OK thanks @marshallward

aekiss commented 5 years ago

It says here that test runs are underway to assess impact of this bug. Can anyone remember who was doing this and what was found? If not, should we close this issue? It has been noted in the tech report and the buggy payu versions are no longer available.