CLIMADA-project / climada_python

Python (3.8+) version of CLIMADA
GNU General Public License v3.0

Knutson's TC climate change effect is double-counted #453

Open bguillod opened 2 years ago

bguillod commented 2 years ago

I was looking into the code which is used to apply Knutson's criteria to generate future TC event sets and I stumbled upon something that I find very weird (a bug I would say).

Knutson's work provides change in frequency and in intensity for each TC category and basin. These changes are however inherently related to each other as they, in the end, aim to describe shifts in frequency curves from two different standpoints:

In the code, changes in frequency and in intensity are BOTH applied. This is fundamentally wrong.

For the avoidance of doubt, I made a quick search and stumbled upon this paper, which also describes this feature of Knutson's analysis: https://rmets.onlinelibrary.wiley.com/doi/epdf/10.1002/qj.4299 It also provides some recommendations on the application of Knutson's change estimates.

In any case, I think the application of Knutson's criteria needs to be revised at some point (Knutson 2015 is somewhat outdated, and the same author has since published more recent and comprehensive estimates), but in the meantime at least this clear bug should be fixed, in my opinion.

I am not sure who in CLIMADA's community is using or working with climate change for TC though.

chahank commented 2 years ago

Yes, this issue was identified and discussed last year. A fix pull request was made, but other issues quickly arose (such as having effectively 0 frequency). My conclusion was that the Knutson parameters are just really not suited for scaling TCs as defined in CLIMADA. In particular, to do a good scaling one would need access to the data underlying the Knutson study.


From: Benoit P. Guillod @.***> Sent: Thursday, 19 May 2022 10:32:11 To: CLIMADA-project/climada_python Cc: Subscribed Subject: [CLIMADA-project/climada_python] Knutson's TC climate change effect is double-counted (Issue #453)

chahank commented 2 years ago

To be more precise, @bguillod , the PR in question was https://github.com/CLIMADA-project/climada_python/commit/8084d9a894083d0ad00a0524d0da384f9d4e4374 . Could you please have a look and see whether this would have solved the mentioned issue? If not, could you propose a solution?

A few points to mention/discuss:

My take on this is that, depending on what you are interested in, the current implementation is perfectly fine. For more advanced analysis, in particular regarding the change in the geographical occurrence of storms, one should use other datasets, such as those from K. Emanuel. In this regard, please consider reading https://assets.researchsquare.com/files/rs-1429968/v1_covered.pdf?c=1648231863 .

bguillod commented 2 years ago

Thanks @chahank for your answer. I do not see where the PR you mention tackles the specific issue I raised here - was the PR not rather about avoiding negative frequencies? All the points you mention are valid ones, I fully agree and indeed they are worth investigating at some point. Also, for those points I agree that "depending on what you are interested in, the current implementation is perfectly fine" (or at least it's a good starting point). However, the specific point I mention is one where the current implementation is fundamentally wrong, isn't it?

chahank commented 2 years ago

The PR is about implementing the change in frequency and intensity exactly as you discussed above. Namely, by considering the changes not category by category, but considering the cumulative distribution.

In a sense it is fundamentally wrong. But you could equally argue that it will always be fundamentally wrong to take the Knutson parameters, which were obtained on a specific dataset, and apply them to completely different datasets, in particular without having the full analysis output at hand. The important question is whether the key drivers are captured reasonably well given all the uncertainties. And this, I think, can be answered with yes, depending on the case study. The issue you raised is just one among several, and I am not convinced that solving this one (and thereby introducing others) will be fundamentally better. This was the conclusion of the PR https://github.com/CLIMADA-project/climada_python/commit/8084d9a894083d0ad00a0524d0da384f9d4e4374 . But please feel free to propose a solution, then we can discuss which one is better.

chahank commented 2 years ago

To be clear: any improvement of the climate model for TC is very much welcome! So thanks @bguillod for restarting the discussion.

bguillod commented 2 years ago

Aaah, now I understand what you mean was solved in that PR, @chahank ! You mean that for a cat 4 storm you should not apply the changes of cat >1 and those of cat >3 and those of cat >=4, for example, but only one of them. Is that what you mean? This is correct indeed.
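To make the "only one of them" point concrete, here is a minimal sketch. It is not the code from the PR: the function name `pick_change`, the dictionary layout (keyed by minimum category) and all numbers are hypothetical, and choosing the most specific threshold is just one possible rule, assumed here for illustration.

```python
def pick_change(storm_cat, changes_by_min_cat):
    """Return ONE change factor for a storm of category `storm_cat`.

    `changes_by_min_cat` maps a minimum category (a cumulative threshold,
    e.g. "cat >= 4") to a relative change factor. Assumed rule: use the
    highest threshold that the storm still satisfies, and 1.0 (no change)
    if the storm is below every threshold.
    """
    eligible = [min_cat for min_cat in changes_by_min_cat if min_cat <= storm_cat]
    if not eligible:
        return 1.0
    return changes_by_min_cat[max(eligible)]


# Hypothetical cumulative change factors (NOT values from Knutson's paper)
changes = {1: 0.95, 3: 1.02, 4: 1.10}

# Applying a single factor to a cat 4 storm ...
single = pick_change(4, changes)

# ... versus wrongly stacking every threshold the storm satisfies
stacked = 1.0
for min_cat, factor in changes.items():
    if min_cat <= 4:
        stacked *= factor

print(single, stacked)
```

With these made-up numbers the stacked factor differs from the single one, which is the effect described above.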

However the issue I am raising is yet another one: it is that you should not apply changes in both intensity and frequency at the same time. Either you adjust intensity, or frequency. Either one can be seen as a by-product of the other one. If you adjust intensity only, frequency of a given category will change as a side-effect so you should not additionally change frequency.
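A small sketch of the side-effect described above, under stated assumptions: the event set and the +5% intensity scaling are invented for illustration, and the helper names are hypothetical. Only the Saffir-Simpson category bounds (in knots) are standard. Scaling the wind speeds alone, without touching any event frequency, already changes how many events fall into each category.

```python
from bisect import bisect_right
from collections import Counter

# Saffir-Simpson lower bounds (1-min sustained wind, knots) for categories 1..5
SS_LOWER_BOUNDS_KN = [64, 83, 96, 113, 137]


def category(wind_kn):
    """Saffir-Simpson category; 0 means below hurricane strength."""
    return bisect_right(SS_LOWER_BOUNDS_KN, wind_kn)


def category_counts(winds_kn):
    """Number of events per category."""
    return Counter(category(w) for w in winds_kn)


# Hypothetical event set: maximum wind speed of each storm, in knots
winds = [60, 70, 80, 90, 94, 100, 110, 112, 120, 135]

# Hypothetical intensity change of +5%, applied to every storm
scaled = [w * 1.05 for w in winds]

before = category_counts(winds)
after = category_counts(scaled)

for cat in range(6):
    print(f"cat {cat}: {before[cat]} -> {after[cat]}")
```

The total number of events is unchanged, but the per-category counts shift towards the higher categories. Applying a separate per-category frequency change on top of this would count that shift a second time.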

Is this clearer? If not we can set up a short call to discuss directly as this might be easier.

chahank commented 2 years ago

Thanks for the clarification. What you are pointing out is part of the same problem.

Basically, the Knutson paper does not disclose the full distribution shift, but only the cumulative changes per intensity and per frequency. This raises several issues:

My take is that one can use the Knutson scaling parameters (and the exact implementation is not that important) for certain applications only. Mostly for large-scale impact estimates. With a proper uncertainty treatment, this should produce acceptable risk estimates.

bguillod commented 2 years ago

For the first point (i.e. the one I raised initially in this issue), my suggestion would be to apply changes to intensity only. The frequency of a given TC category will then be affected as a side-effect. Of course, one should then define whether all intensity changes are applied regardless of significance, or only the statistically significant ones, or, as a third option, whether changes in intensity are applied whenever either the intensity or the frequency change is significant in Knutson's work.

chahank commented 2 years ago

I still disagree. While it is true that some double counting will happen, just not changing the frequency will result in an undercounting.

A good way forward would be to update the whole method based on the latest raw data from the Knutson 2020 paper, using the recommendation from the Jewson 2021 paper. Would you like to do that?

bguillod commented 2 years ago

> I still disagree. While it is true that some double counting will happen, just not changing the frequency will result in an undercounting.

Why would it lead to undercounting? I cannot agree here without more justification as to why you think that would be the case.

> A good way forward would be to update the whole method based on the latest raw data from the Knutson 2020 paper, using the recommendation from the Jewson 2021 paper. Would you like to do that?

That would be great and would be my preferred approach. But I'd first need to look into these two papers in a bit more detail before I can say whether that would work, how it could be implemented, and ultimately whether and when I might find time to implement it.

chahank commented 2 years ago

> Why would it lead to undercounting? I cannot agree here without more justification as why you think that would be the case.

Sure, you're right. My point is actually not as strong as I have written it. Sorry for that. Here is my line of thought:

The full distributions are collapsed along either the frequency or the intensity axis. Then, per storm category, the values are aggregated and one relative value is obtained. Thus, the shape of the distribution in each category is lost. This means that taking only the shift along one of the data axes (e.g. intensity) risks underestimating the changes.

Furthermore, the risk is the product of the frequency and the value of the non-linear impact function evaluated at a given intensity. Thus, changing intensity and changing frequency do not result in the same change in risk. Hence, while it might be equivalent to shift either frequency or intensity when looking only at the hazard, this might not be the case for the impact.
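A rough numerical illustration of this argument, as a sketch only: the impact function below is a cubic sigmoid chosen for illustration with hypothetical parameters (`v_thresh`, `v_half`), it is not claimed to be the impact function used in CLIMADA, and the three events are made up. Risk is computed as the sum over events of frequency times impact fraction.

```python
def impact_fraction(wind_ms, v_thresh=25.0, v_half=75.0):
    """Damage fraction in [0, 1): zero below v_thresh, 0.5 at v_half.

    Illustrative non-linear shape with hypothetical parameters (m/s).
    """
    v = max(wind_ms - v_thresh, 0.0) / (v_half - v_thresh)
    return v**3 / (1.0 + v**3)


def expected_annual_impact(events):
    """Sum of frequency * impact fraction over (frequency, wind in m/s) pairs."""
    return sum(freq * impact_fraction(wind) for freq, wind in events)


# Hypothetical events: (annual frequency, wind speed in m/s)
events = [(0.5, 35.0), (0.2, 50.0), (0.05, 65.0)]

base = expected_annual_impact(events)
freq_up = expected_annual_impact([(f * 1.05, w) for f, w in events])  # +5% frequency
int_up = expected_annual_impact([(f, w * 1.05) for f, w in events])   # +5% intensity

print(f"+5% frequency: risk x {freq_up / base:.3f}")
print(f"+5% intensity: risk x {int_up / base:.3f}")
```

Risk is linear in frequency, so a 5% frequency increase raises it by exactly 5%, whereas the same relative increase in intensity passes through the non-linear impact function and, for these made-up values, raises risk by considerably more.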

Does this make sense?