georgebv / pyextremes

Extreme Value Analysis (EVA) in Python
https://georgebv.github.io/pyextremes/
MIT License
237 stars, 47 forks

How can I compute the probability $P(X > value)$? #87

Closed nucflash closed 1 month ago

nucflash commented 1 month ago

Hi, thank you for the great effort to bring Extreme Value Analysis to Python! It is extremely useful to have something like this in Python as it plays so well with all other data science tools that are around.

I'm following Fawcett's tutorial/lectures. One of his example websites includes a demo titled "Choose a Wave height to find the probability of exceeding it every year", which amounts to calculating $P(X > value)$, where $value$ is the wave height in feet.

I'm trying to figure out how to do the same in pyextremes. I looked through the tutorials and the code but couldn't find a way, though I'm sure one must exist. Could you please show me how?

Thanks again for the great work!

nucflash commented 1 month ago

I think I found it:

1 - model.model.cdf(value)

The CDF returns $P(X \le value)$, so the complement gives what I want: $1 - P(X \le value) = P(X > value)$.
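To make the complement concrete without depending on any particular library, here is a minimal pure-Python sketch of a GEV CDF and the resulting exceedance probability. The function names and the parameter values (`LOC`, `SCALE`, `SHAPE`) are hypothetical and are not taken from pyextremes or from a fitted model. The shape parameter follows the usual $\xi$ sign convention; note that `scipy.stats.genextreme` uses the opposite sign (`c = -xi`).

```python
import math


def gev_cdf(x, loc, scale, shape):
    """P(X <= x) for a GEV distribution (shape = xi, standard sign convention)."""
    z = (x - loc) / scale
    if shape == 0.0:
        # Gumbel limit of the GEV family
        return math.exp(-math.exp(-z))
    t = 1.0 + shape * z
    if t <= 0.0:
        # x lies outside the support: below the lower bound when shape > 0,
        # above the upper bound when shape < 0
        return 0.0 if shape > 0 else 1.0
    return math.exp(-(t ** (-1.0 / shape)))


def exceedance_probability(x, loc, scale, shape):
    """P(X > x), computed as the complement of the CDF."""
    return 1.0 - gev_cdf(x, loc, scale, shape)


# Hypothetical parameters, for illustration only
LOC, SCALE, SHAPE = 10.0, 2.0, 0.1
p = exceedance_probability(15.0, LOC, SCALE, SHAPE)
print(f"P(X > 15) = {p:.4f}")
```

With a block-maxima model fitted to annual maxima, this value is the probability of exceeding the level in any single year.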

My next question is:

Is it possible to calculate this probability within a specific return period? The question I want to answer is "What is the probability of reaching $value$ in the first year, the second year, etc.?"

Thanks!

georgebv commented 1 month ago

You are correct. Also, don't forget to factor in the rate of extremes when converting a return period into a probability when using POT.
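A rough sketch of what factoring in the rate can look like for POT. The GPD describes exceedances over the threshold, so its survival function is conditional on an event being an extreme; multiplying by the rate of extremes (events per year) gives the expected number of exceedances of a level per year, and the return period is the reciprocal. All names and numbers below are hypothetical. The last helper additionally assumes that extremes arrive as a Poisson process, which the thread does not state.

```python
import math


def gpd_sf(x, threshold, scale, shape):
    """P(X > x | X > threshold) for a generalized Pareto distribution."""
    if x <= threshold:
        return 1.0
    z = (x - threshold) / scale
    if shape == 0.0:
        # Exponential limit of the GPD family
        return math.exp(-z)
    t = 1.0 + shape * z
    if t <= 0.0:
        # Only reachable for shape < 0: x is above the upper bound of the support
        return 0.0
    return t ** (-1.0 / shape)


def return_period_years(x, threshold, scale, shape, rate_per_year):
    """Return period of level x: 1 / (rate of extremes * conditional exceedance probability)."""
    return 1.0 / (rate_per_year * gpd_sf(x, threshold, scale, shape))


def annual_exceedance_probability(x, threshold, scale, shape, rate_per_year):
    """P(at least one exceedance of x in a year), ASSUMING Poisson arrivals of extremes."""
    expected_per_year = rate_per_year * gpd_sf(x, threshold, scale, shape)
    return -math.expm1(-expected_per_year)


# Hypothetical POT fit: threshold 5, scale 2, shape 0, three extremes per year on average
THRESHOLD, SCALE, SHAPE, RATE = 5.0, 2.0, 0.0, 3.0
T = return_period_years(11.0, THRESHOLD, SCALE, SHAPE, RATE)
p_year = annual_exceedance_probability(11.0, THRESHOLD, SCALE, SHAPE, RATE)
print(f"return period = {T:.2f} years, annual exceedance probability = {p_year:.4f}")
```

Ignoring the rate here would treat the conditional probability `gpd_sf(...)` as if it were a per-year probability, which is only right when there is exactly one extreme per year on average.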

Regarding your second question, extreme events are assumed to be IID (independent and identically distributed), so it doesn't matter which year you look at. The question you can answer, though, is "What is the probability of wave height exceeding X at least once in N years?" That is the complement of the probability of not observing it N years in a row: $1 - (1 - p)^N$, where $p$ is the probability of exceeding X in any single year.
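As a small worked sketch of that last point: if the annual exceedance probability is $p$ and years are independent, the chance of no exceedance in $N$ consecutive years is $(1 - p)^N$, so the chance of at least one exceedance is its complement. The helper name and the example value of $p$ below are hypothetical.

```python
def prob_at_least_once(p_annual, n_years):
    """P(at least one exceedance in n_years), assuming independent years."""
    if not 0.0 <= p_annual <= 1.0:
        raise ValueError("p_annual must be a probability in [0, 1]")
    if n_years < 0:
        raise ValueError("n_years must be non-negative")
    # Complement of "no exceedance in any of the n_years years"
    return 1.0 - (1.0 - p_annual) ** n_years


# Hypothetical example: a level with a 1% annual exceedance probability
for n in (1, 10, 50, 100):
    print(n, round(prob_at_least_once(0.01, n), 4))
```

Even a "100-year" level (1% per year) is more likely than not to be exceeded at least once over a 100-year span.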