Closed by zhangshixuan1987 1 week ago
@zhangshixuan1987 thank you for the report. PMP used to have this (overwriting the model annual-cycle time axis with that of the obs), but I removed it at some point, presuming such issues would no longer occur with reference datasets processed via obs4MIPs. However, I see your point, and it might be worth considering bringing it back to improve flexibility in reference dataset selection.
One thing to think about is the case of calculating annual averages. An extreme case is a model with a 360-day calendar, whose months must be equally weighted in the average; overwriting its time axis with the observation's could result in inaccurate weighting. Let me give this some more thought to see if there is a systematic and easy way to handle it. In the meantime, please feel free to post any thoughts or suggestions.
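To illustrate the weighting concern above: a minimal numpy sketch (the monthly values are hypothetical, not from any real dataset) showing how the annual mean of the same 12-month cycle changes depending on whether months are equally weighted (correct for a 360-day calendar) or weighted by Gregorian/noleap month lengths (what an overwritten observational time axis would imply).

```python
import numpy as np

# Hypothetical monthly means of some variable for one annual cycle.
monthly = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0,
                    6.0, 5.0, 4.0, 3.0, 2.0, 1.0])

# 360-day calendar: every month is 30 days, so equal weights are correct.
w_360 = np.full(12, 30.0)
mean_360 = np.average(monthly, weights=w_360)

# noleap (Gregorian-like) month lengths, as implied by an obs time axis.
w_noleap = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31], float)
mean_noleap = np.average(monthly, weights=w_noleap)

print(mean_360, mean_noleap)  # the two annual means differ slightly
```

The difference is small here, but it shows why blindly replacing a 360-day model time axis could bias the annual average.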
@lee1043 Hi Jiwoo, I agree that this would not be an issue if all the data were processed in a consistent way (e.g., using the obs4MIPs method; perhaps users should do so to ensure a fair comparison between model and observations). However, this does reduce flexibility if users want to use established datasets. In my case, I was trying to directly use the observational climatology datasets derived by E3SM diagnostics, which is part of what caused the calculation failure.
I added this correction as an option, especially for when the observation does not have a CF-compliant time axis, but now I am more convinced to make this a default procedure if it continues to be an issue with @zhangshixuan1987's workflow. @zhangshixuan1987 What is the current status?
@lee1043: Hi Jiwoo, could you please explain what you mean by "What is the current status?"
@zhangshixuan1987 sorry, I was not clear enough. I was wondering whether this issue continues with your reference dataset and whether you had to use any workaround; if so, I can make the time sync between model and obs a default process.
@lee1043: As I just mentioned in the e-mail I sent to you, the workflow for E3SM-pmp was designed to derive the observational and model climatologies on the fly from the monthly datasets available in E3SM diagnostics. I currently have two methods designed to achieve this:
I hope this answers your question. Overall, I think the proposed changes here are a simple, if possibly dirty, way to avoid the potential issues.
Thank you!
With best wishes,
Shixuan
@zhangshixuan1987 thank you for the reminder and for sharing your workflow. I have now updated PR #1162 to fully address this issue by making the time sync between model and observation the default for mean climate metrics, since they use only the 12 time steps of the annual cycle. I overwrite the observation time with the model time because models usually have a more CF-compliant time axis. Once PR #1162 is merged, you may be able to skip method 2 from your comment.
Thanks! @lee1043 This sounds great to me.
What happened?
I tried to generate the mean climate metrics with PCMDI by comparing model simulations with observations. My simulation data here include pre-industrial control simulations (model output time starting from year 1), AMIP simulations (model output time starting from year 1850 or 1950), and coupled historical simulations (model time starting from year 1850 or 1950).
Following the tutorial on the PCMDI webpage, I was able to obtain the mean climate metrics expected from PCMDI, except for the "rms_xyt" metric, for which I always got a NaN value in the result files. I explored my calculation procedure carefully and found that the issue pointed to the `rms_xyt(dm, do, var=None)` function in the following code:
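(The original snippet is not reproduced here. As a rough stand-in, the statistic in question is a root-mean-square difference over time, latitude, and longitude; the sketch below is an unweighted numpy simplification, not PMP's actual implementation, which operates on xarray datasets and applies area weighting.)

```python
import numpy as np

def rms_xyt_sketch(dm, do):
    """Unweighted RMS of (model - obs) over time, lat, lon.

    Hypothetical simplification of PMP's rms_xyt for illustration only;
    the real function uses xarray objects and spatial area weights.
    """
    diff = dm - do
    return float(np.sqrt(np.mean(diff ** 2)))

# 12 months x 2 lat x 3 lon dummy climatologies
dm = np.ones((12, 2, 3))
do = np.zeros((12, 2, 3))
print(rms_xyt_sketch(dm, do))  # → 1.0
```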
Specifically, the time dimensions in dm (annual cycle of the model mean climatology) and do (annual cycle of the observation mean climatology) were inconsistent. In my case,
Note that both datasets include 12 time samples for the annual cycle, but they differ slightly in the decimal part of the time values. Such small differences can invalidate calculations such as `ds["diff_square"] = (dm[var] - do[var]) ** 2`, because Python (via xarray's label-based alignment) will treat the two datasets as not aligned with each other.
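The alignment failure can be demonstrated with pandas, which aligns arithmetic on index labels much as xarray aligns on coordinates (the fractional-day values below are made up to mimic slightly offset month midpoints):

```python
import pandas as pd

# Model vs. observation annual-cycle time axes: the month midpoints
# differ slightly in the decimals (hypothetical values).
t_model = [15.5, 45.0, 74.5]
t_obs = [15.5, 45.5, 75.0]

dm = pd.Series([1.0, 2.0, 3.0], index=t_model)
do = pd.Series([1.0, 2.0, 3.0], index=t_obs)

# Arithmetic aligns on the index labels: only 15.5 exists in both,
# so every other position is filled with NaN.
diff_square = (dm - do) ** 2
print(diff_square)
```

Any statistic reduced over the full axis of `diff_square` without skipping NaNs then comes out as NaN, matching the symptom described above. (xarray behaves similarly, except that its default inner-join alignment silently drops the mismatched time steps instead.)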
To my understanding, such differences in the time between "dm" and "do" can easily arise in many cases, given that models usually use a "noleap" calendar while observations often use the "Gregorian" calendar. Therefore, I made some modifications in `compute_statistics.py`:
The idea here is to use the observation as the reference and change the model time to always be consistent with the observations. In this way, the times in "dm" and "do" are guaranteed to always be aligned with each other. With these changes, the issues I mentioned above are resolved, and I can obtain valid "rms_xyt" metrics.
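Continuing the small pandas analogy from earlier, the fix amounts to replacing one time axis with the other before differencing (the series values and index labels here are hypothetical):

```python
import pandas as pd

t_model = [15.5, 45.0, 74.5]   # hypothetical noleap month midpoints
t_obs = [15.5, 45.5, 75.0]     # hypothetical Gregorian month midpoints

dm = pd.Series([1.5, 2.5, 3.5], index=t_model)
do = pd.Series([1.0, 2.0, 3.0], index=t_obs)

# Overwrite the model time axis with the observation's, so that
# label-based alignment pairs the 12 annual-cycle steps one-to-one.
dm.index = do.index

diff_square = (dm - do) ** 2
print(diff_square.tolist())  # → [0.25, 0.25, 0.25]
```

With identical labels, every time step pairs up and no NaNs are introduced, which is why the statistics become valid after the change.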
I am reporting my use case of PCMDI here to see if it indicates a potential vulnerability in the calculation of the PCMDI mean climate metrics.
What did you expect to happen? Are there any possible answers you came across?
No response
Minimal Complete Verifiable Example (MVCE)
No response
Relevant log output
No response
Anything else we need to know?
No response
Environment
The PCMDI version in my case is an earlier one, "pcmdi_metrics-3.2-pyhd8ed1ab_0", which was installed on the Argon Chrysilis machine using the "conda install" method, with Python 3.10.10.