iiasa / rime

Rapid Impact Model Emulator
GNU General Public License v3.0

Include *all* years matching a warming level #33

Closed: perrette closed this issue 1 month ago

perrette commented 1 month ago

Currently we include only the data points from a time series when it first crosses a warming level. Why not include all data points that fall within that warming level?

The figure below (please ignore old CIE) shows the current rimeX emulator overlaid with the actual ISIMIP2 time-series. The black dots (not indicated in the legend) are the points we actually retain in the calibration. The 2.6 scenario for precipitation (top right) shows significant variability in the 21-yr running mean that is not captured by our emulator, because that self-imposed rule discards data points we could otherwise use.

I am not sure where that rule came from in the first place. I understand it could be a conservative measure to avoid mixing in data points from a "decline" scenario, but I'd argue those also contribute to the uncertainty. I rather think this was a rule useful in the previous methodology that is no longer useful here, since we can integrate that data naturally.

[Figure: rimeX emulator overlaid with ISIMIP2 time-series; black dots mark the calibration points currently retained]
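
For concreteness, here is a minimal sketch of the two selection rules, assuming annual GMT anomalies in a pandas Series indexed by year. The function name, the tolerance and the window defaults are illustrative, not rime's actual API:

```python
import pandas as pd

def select_warming_level_years(gmt, level, tol=0.25, window=21, mode="all"):
    """Return the years whose `window`-year running-mean warming lies
    within `tol` of `level`.

    mode="first" keeps only the window centred on the first crossing
    (the current behaviour); mode="all" keeps every matching year.
    """
    smoothed = gmt.rolling(window, center=True).mean()
    matching = smoothed.index[(smoothed - level).abs() <= tol]
    if mode == "first" and len(matching) > 0:
        half = window // 2
        first = matching[0]
        matching = matching[(matching >= first - half) & (matching <= first + half)]
    return matching
```

With mode="all", the plateau years of the 2.6 scenario would all feed the calibration, instead of just the first 21-year window.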

perrette commented 1 month ago

I report here from @NiklasSchwind

I also saw this and thought that it might have been a deliberate design choice: changing to using all points could also lead to worse results for more "usual" pathways, as you would also consider values appearing after a long time lag at the same warming level. I certainly think it is worth a try though; I think it would substantially improve on getting low-emission scenarios right. But (and I am sorry, I know you would probably rather keep it simple) I would make it optional at first, at least until we investigate how it affects the validation.

perrette commented 1 month ago

And my reply here. I agree in general. Keeping only the first-encountered 21-year window can be a meaningful design choice if the intent is to apply the emulator to continuously warming temperature pathways. In our application (possibly in contrast to @byersiiasa's rime), at least for now, we don't limit the emulator to monotonically increasing scenarios. In any case, besides the testing that obviously should be done, this is a question we might want to discuss with Carl & CA, precisely because it depends on the range of desired applications. From my perspective though, the emulator calibration base should be as broad as possible, to avoid being overconfident (overconfidence is the worst thing one wants to avoid). This is especially true given the focus on strong mitigation scenarios (like 1.5C) at CA and IIASA.

In practical terms, regarding making things optional or not, there should certainly be a simple way to try out various options in the code, and it should be as generic and as little option-specific as possible (like a custom, user-defined path where calibration data is stored), because we cannot possibly keep track of all the variations and choices we have encountered and will encounter. And on this particular issue, I believe we should reach a clear decision ASAP so we can move forward in a linear manner; otherwise we just lose time, and I know what I am talking about (I do love exploring all options).

perrette commented 1 month ago

here you are @carlschleussner

cschleussner commented 1 month ago

Hi both, thanks for this. I went back to my notes, but can't recall a strong reason why it is implemented this way. I also vaguely remember discussing with Peter back then that we'd have a bigger sample at lower GMT levels with all data points included. But then we opted against this.

But I tend to agree with your point, Mahé: it would actually be useful to use all the information we have. If we can make it optional (a switch: either only the first crossing, all points at that level, or only points on the way 'down'), that would be a nice feature too. It would also allow us to explore the differences on the way up/down in a systematic manner; it might be nice for the paper just to have that toggle switch and show it in a figure. Is that what you had in mind also?

perrette commented 1 month ago

From the code perspective, all options can be explored, and the functions are made as flexible and explicit as possible. There is also a config file that sets the defaults, where a new option controlling this behavior can be introduced. However, I think it's important to decide on a default you don't have to think about every time you use the tool. I'd be in favor of including all points by default, but again, that's a design choice.

perrette commented 1 month ago

From this exchange, and to move forward, I suggest the following: make including *all* data points matching a warming level the new default, and keep the current first-crossing behavior available as an option.

Let me know if you're OK with that, or if you prefer to keep the current default of not including all points. I won't tackle this right away: I expect to make a big push on the rime code the week of November 4th, and I'll proceed as stated here unless I hear otherwise from you.

cschleussner commented 1 month ago

Thanks! I agree with that proposal. Just to make sure: you're implementing two options (default: all; optional: first), right?

As I see limited value in the 'first'-only approach, an alternative version would be three options: "all" (default), "all pre GMT peak", and "all post GMT peak".

It might require a bit more work though (because we'd need to know if and when peak warming occurred in the underlying GMT time series). And I don't want to overcomplicate things :)
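
If we go down that route, diagnosing the peak is straightforward on a smoothed series. A sketch, again with illustrative names and assuming annual GMT in a pandas Series indexed by year:

```python
def split_at_peak_warming(gmt, window=21):
    """Split years into pre- and post-peak sets, taking the maximum of the
    smoothed GMT as the peak; for monotonically warming scenarios the
    post-peak set is empty."""
    smoothed = gmt.rolling(window, center=True).mean().dropna()
    peak_year = smoothed.idxmax()
    pre = smoothed.index[smoothed.index <= peak_year]
    post = smoothed.index[smoothed.index > peak_year]
    return pre, post
```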

perrette commented 1 month ago

Yeah, that was my intention. I also don't see much interest in the "first-only" approach, except that it tries to minimize any effect a time lag might have at constant warming; that could be fine if we only used the emulator with fast-warming scenarios, but we don't.

As far as peaking goes, I am not sure whether any issues that arise (I have not investigated this first hand, but I do remember it being mentioned during Katja Frieler's work on pattern scaling) are related to the time lag or to other effects (I am thinking of negative atmosphere-ocean heat flux). It's not a big deal to flag any peaking and create warming tables pre or post peak, but of course this would only apply to some scenarios and not others, and then there is the question of how to apply it. You would probably want two calibration tables (pre- and post-peak), and you would also need to diagnose pre- and post-peaking when applying these tables to new temperature pathways.

That is doable, but I wonder whether it's the smartest way to deal with the issue. On the one hand it comes at a cost, because it complicates the code (never a good thing for maintenance, bugs, testing needs etc.); on the other hand it's not clear to me how large the potential benefit is. I mean, is it really a binary thing (peaking vs. not peaking)? Maybe it depends more on the slope, or on the integral of past warming (heat uptake). And splitting the training data in two, with rather few scenarios providing training data for the post-peak period, may introduce biases of its own. But I don't know: I'm just seeing the complications without having been confronted with the problems first hand, and without a clear feeling that this would be a solution. My suggestion would be to treat peaking as a separate issue, and to discuss it and try out various things when we want to dedicate time to it.

perrette commented 1 month ago

In any case, for now I'll add a "selection_mode" parameter (or similar) that can take different values and that we can extend over time.
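
A sketch of what such a dispatch could look like (hypothetical names, building on the selection sketch above; "first" and "all" now, pre/post-peak variants later):

```python
# Accepted values today; could later grow "all_pre_peak", "all_post_peak".
SELECTION_MODES = ("first", "all")

def select_years(gmt, level, selection_mode="all"):
    """Validate the configured mode and delegate to the selection rule."""
    if selection_mode not in SELECTION_MODES:
        raise ValueError(f"unknown selection_mode: {selection_mode!r}")
    return select_warming_level_years(gmt, level, mode=selection_mode)
```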

cschleussner commented 1 month ago

Fair. Not sure it's so important for the CMIP6 batch, but it might become so in the future if we can actually train RIME on overshoot data. But that's for a later day :) Good discussion, though.


perrette commented 1 month ago

As expected, the uncertainty is larger when all data points are included, and only for pr under RCP 2.6, as it should be.

[Figure: updated emulator with all data points included, showing larger uncertainty for pr under RCP 2.6]

cschleussner commented 1 month ago

yes, makes sense.


perrette commented 1 month ago

And before closing this issue, I'd like to point out what the two inputs look like: the warming level file [screenshot], and the individual impact tables to feed into the emulator (the impact_data_records below) [screenshot].

That means the first one (the global warming file) can be used to filter the data points of interest (pre-peak etc.) before calling recombine_gmt_ensemble(impact_data_records, gmt_ensemble, quantiles), without having to reprocess everything every time. So I'd say there is no need for a further "preprocessing" option; it's an "emulator-level" option.
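
As a hedged illustration of that emulator-level filtering: the column names "model", "scenario", "year" and "warming" are assumptions about the table layout; only recombine_gmt_ensemble and its arguments come from the comment above.

```python
# Assumed inputs: impact_data_records and warming_levels are DataFrames
# sharing (model, scenario, year) columns; warming_levels also has "warming".
# Locate each scenario's peak-warming year, then keep only pre-peak records.
peaks = (warming_levels.sort_values("warming")
         .groupby(["model", "scenario"], as_index=False).last()
         [["model", "scenario", "year"]]
         .rename(columns={"year": "peak_year"}))

filtered = impact_data_records.merge(peaks, on=["model", "scenario"])
filtered = filtered[filtered["year"] <= filtered["peak_year"]]

result = recombine_gmt_ensemble(filtered, gmt_ensemble, quantiles)
```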

NiklasSchwind commented 1 month ago

Great, thanks!