ESCOMP / CTSM

Community Terrestrial Systems Model (includes the Community Land Model of CESM)
http://www.cesm.ucar.edu/models/cesm2.0/land/

A bug in calculating accumulated fields (24/240 hours averaged) when using a smaller timestep #1789

Closed Duseong closed 2 years ago

Duseong commented 2 years ago

Brief summary of bug

It looks like there's a bug in calculating accumulated fields (e.g., TV24, TV240, PAR24_sun, PAR240_sun, etc.) when using a smaller timestep.

General bug information

CTSM version you are using: ctsm5.1.dev019 (cam6_3_018)

Does this bug cause significantly incorrect results in the model's science? Yes

Configurations affected: Regional refinement simulations or any configurations that use a much smaller timestep than the global 1-degree model.

Details of bug

Please see the attached file (Accumulated_fields_time_series.pdf) for time series of TV, TV24, TV240, PAR_sun, PAR24_sun, and PAR240_sun at the Manaus point in the Amazon. Black dots are from the ne30 simulation, and red dots are from the same ne30 simulation but with a smaller timestep (3.75 minutes, i.e., ATM_NCPL = LND_NCPL = 384). The 24- and 240-hour averaged fields show unexpectedly large fluctuations in the smaller-timestep run, so I suspect there is a bug in how those fields are calculated. Since regional refinement simulations use a smaller timestep, results from those models are heavily affected by these fields. I found that global biogenic emissions, OH, and other chemical fields like CO changed substantially. Other atmospheric fields may also be affected.

Important details of your setup / configuration so we can reproduce the bug

I changed ATM_NCPL (=LND_NCPL) from 48 to 384 to see the effects of the timestep on those fields. The spatial resolution was the same for both (ne30).
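For reference, the relationship between the coupling count and the timestep can be sketched as below. This assumes ATM_NCPL counts coupling periods per day (consistent with the 3.75-minute figure quoted above); the function name is hypothetical and only for illustration.

```python
SECONDS_PER_DAY = 86400


def coupling_timestep_seconds(ncpl: int) -> float:
    """Time step in seconds implied by `ncpl` coupling periods per day."""
    return SECONDS_PER_DAY / ncpl


for ncpl in (48, 384):
    dt = coupling_timestep_seconds(ncpl)
    print(f"NCPL={ncpl}: {dt:.0f} s = {dt / 60:.2f} min")
```

So 48 corresponds to the usual 30-minute step and 384 to a 3.75-minute step.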

billsacks commented 2 years ago

Thank you for opening this issue. I agree that this appears to be a significant bug!

From looking through the code, I think I see what's happening here: the PERIOD of accumulated fields is read from the initial conditions file, but I don't think it needs to be, and doing so is problematic if the run you're doing uses a different time step than the time step used in creating the initial conditions file originally.
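A rough Python sketch of this hypothesis (not the actual Fortran in accumulMod.F90; all names here are hypothetical): the period is first computed in time steps from the current time step, and then overwritten by whatever step count was stored in the initial conditions file.

```python
SECONDS_PER_DAY = 86400


def period_in_steps(period_days, dtime_s):
    """Number of time steps spanning `period_days` at a step of `dtime_s` seconds."""
    return round(period_days * SECONDS_PER_DAY / dtime_s)


def init_period(period_days, dtime_s, restart_period_steps=None,
                read_period_from_restart=True):
    """Return the averaging period in steps, optionally overwritten from a restart."""
    period = period_in_steps(period_days, dtime_s)
    if read_period_from_restart and restart_period_steps is not None:
        # Hypothesized bug: the stored step count reflects the OLD time step.
        period = restart_period_steps
    return period


def window_days(period_steps, dtime_s):
    """Actual length of the averaging window in days."""
    return period_steps * dtime_s / SECONDS_PER_DAY


# Initial conditions written with a 1800 s step; new run uses a 225 s step.
restart_steps = period_in_steps(10, 1800)
buggy = init_period(10, 225, restart_steps)
fixed = init_period(10, 225, restart_steps, read_period_from_restart=False)
print(window_days(buggy, 225), window_days(fixed, 225))
```

Under these assumptions, a nominal 10-day average shrinks to 1.25 days when the stale step count is used, and stays at 10 days when the period is not read from the file.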

Can you try redoing your test after rebuilding the code with this block of code deleted:

https://github.com/ESCOMP/CTSM/blob/25a7cd3f240787d3c798162def26d9c60d9871da/src/main/accumulMod.F90#L767-L776

If that gives other problems, then a simpler but safer experiment would be to redo your test after setting ./xmlchange CLM_FORCE_COLDSTART=on so that the model doesn't use a restart file at all. (That won't be good for science, but if my hypothesis is right, then it should get around this issue. I'd like to see if that's true.)

billsacks commented 2 years ago

(Note to self and @samsrabin : usually my experience is that it's a bad idea to get side-tracked going down tangential rabbit holes, but apparently this one https://github.com/ESCOMP/CTSM/pull/1684#discussion_r828546304 was worth following up on....)

Duseong commented 2 years ago

Thanks a lot for the quick response! I will give it a spin and will let you know if it's corrected!

Duseong commented 2 years ago

Thanks again for your comment!

Please see the attached file (Accumulated_fields_time_series_bugfix.pdf), which now includes another simulation (blue dots) with your suggested bug fix.

I think it solved the problem: the accumulated fields are now much closer to the base simulation.

These are only 5-day simulation results, so I will do a longer run for final confirmation, but I don't expect a longer run to show different results.

billsacks commented 2 years ago

Thanks a lot for trying that out and posting the results! I'm not too surprised to see that it takes some time for the accumulation fields to adjust to the new time step. It looks to me like after this initial period they appear to be doing the right thing. I'm planning to move ahead with this fix, but please do let us know if you see other problems.

adamrher commented 2 years ago

@Duseong says

Since regional refinement simulations use smaller timestep, results from those models are heavily affected by these fields. I found that global biogenic emissions, OH, and other chemical fields like CO were changed substantially. It may also affect other atmospheric fields.

I just want to verify that this issue is purely a problem w/ diagnostic output, and not something that impacts other fields as indicated by this comment above.

lkemmons commented 2 years ago

This does have a real impact on MEGAN biogenic emissions.

adamrher commented 2 years ago

oy vey. OK thanks.

Duseong commented 2 years ago

Following up on the 5-day test above: Louisa and I have since run simulations several months long, and we again confirmed that Bill's fix solves the problem.

lkemmons commented 2 years ago

The accumulMod routines are used in many other parts of CLM as well. We have not investigated the impact on other CLM fields.

adamrher commented 2 years ago

I'd be interested to know whether our standard CAM6 low-top configuration (not CAM-CHEM) is impacted by this bug. What species are considered MEGAN emissions? Does it refer to aerosol species or chemical species?

In the CAM configuration I'm referring to, these chemical species are dynamically active (i.e., dycore tracers):

DMS, H2O2, H2SO4, SO2, SOAG

... and the aerosol species:

bc_aX, dst_aX, ncl_aX, num_aX, pom_aX, so4_aX, soa_aX

lkemmons commented 2 years ago

CAM alone does not use MEGAN. However these accumulated averages are used throughout CLM for a variety of parameters. I have not yet had a chance to look at the impact on physical parameters in CLM, nor the impact on T, Q, etc. in CAM.

billsacks commented 2 years ago

@adamrher - just to echo what @lkemmons said, I would (unfortunately) expect this bug to have some impact on any simulation with a non-30-minute time step, but I don't have a sense of how large the impact will be.

dlawrenncar commented 2 years ago

My guess, but mainly a guess, is that it wouldn't have much impact on the physical simulation, especially if using the prescribed vegetation (CLMSP) configuration.

billsacks commented 2 years ago

I have done some more investigation of this. Based on a quick search through the code (i.e., I may have missed something), it looks like, in an SP (i.e., non-BGC) case, the only accumulation field that impacts aspects of the model other than VOC emissions is T10 - the 10-day running mean air temperature - which appears in a few places in the photosynthesis calculations. In a run whose time step isn't too different from 30 minutes – e.g., a 15-minute time step – I wouldn't expect this bug to make much difference in an SP case. But in a run with a much shorter time step, the differences would become larger. I don't have a feeling for how important this T10 variable is in the photosynthesis calculation, but I do see that it has some impact on the evolution of the model.

More accumulation fields come into play in BGC simulations, e.g., impacting various phenology calculations.

ManYue07 commented 2 years ago

Hi Bill (@billsacks), I have a quick question about this bug. Can this bug be interpreted as a time step of fewer than 30 minutes resulting in inconsistent time steps in CLM and CAM, and thus further affecting the simulation results? If so, how should we understand the impact of this inconsistency?

billsacks commented 2 years ago

Can this bug be interpreted as a time step of fewer than 30 minutes resulting in inconsistent time steps in CLM and CAM?

Not exactly. The issue is more subtle: CTSM has a number of accumulation fields that accumulate averages over some period. These accumulation fields weren't properly handling a change in time step (relative to what was used to generate the initial conditions file). So, for example, if you are using a 15-minute time step with an initial conditions file that originated from a run with a 30-minute time step (at some point in its history), then an average that was supposed to be 10-day instead becomes 5-day; an average that was supposed to be 1-day becomes 12-hour, etc. (The issue is that the number of time steps in the averaging period was staying fixed rather than the actual amount of time staying fixed.) It appears that the biggest impacts are on VOC emissions and in BGC runs; we expect the impact to be small (but still non-zero) in prescribed phenology (SP) runs that don't use VOC emissions.
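To illustrate why a too-short window shows up as large fluctuations, here is a minimal sketch. It assumes a recursive running mean of the form val = ((n - 1) * val + field) / n with n capped at the period in steps, and it uses a synthetic diurnal signal; both the update form and the signal are assumptions for illustration, not taken from this thread.

```python
import math

SECONDS_PER_DAY = 86400


def running_mean_series(period_steps, dtime_s, n_days=20):
    """Recursive running mean of a synthetic diurnal cycle (hypothetical signal)."""
    n_steps = int(n_days * SECONDS_PER_DAY / dtime_s)
    val = 0.0
    out = []
    for nstep in range(1, n_steps + 1):
        t = nstep * dtime_s
        # Synthetic field: mean 300 with a 24-hour sinusoid of amplitude 10.
        field = 300.0 + 10.0 * math.sin(2.0 * math.pi * t / SECONDS_PER_DAY)
        # Cap the count at the period so early steps form a plain average.
        count = min(nstep, period_steps)
        val = ((count - 1) * val + field) / count
        out.append(val)
    return out


def last_day_range(series, dtime_s):
    """Max minus min of the running mean over the final simulated day."""
    steps_per_day = int(SECONDS_PER_DAY / dtime_s)
    tail = series[-steps_per_day:]
    return max(tail) - min(tail)


dtime = 225.0  # 3.75-minute time step
stale = running_mean_series(period_steps=48, dtime_s=dtime)     # 48 steps = 3 hours
correct = running_mean_series(period_steps=384, dtime_s=dtime)  # 384 steps = 24 hours

print("range with stale period:  ", last_day_range(stale, dtime))
print("range with correct period:", last_day_range(correct, dtime))
```

With the stale 48-step period, the nominal "24-hour" mean really covers only 3 hours at a 225 s step, so it tracks much more of the diurnal cycle than the mean computed with the correct 384-step period.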

ManYue07 commented 2 years ago

Thank you so much for the explanation; it helped a lot.