nci / scores

Metrics for the verification, evaluation and optimisation of forecasts, predictions or models.
https://scores.readthedocs.io/
Apache License 2.0

Kling-Gupta efficiency and percentage bias #613

Closed: durgals closed this issue 1 month ago

durgals commented 3 months ago

I would like the following metric, statistical test or data processing tool to be considered for addition to the scores repository
In hydrology, the Kling-Gupta Efficiency (KGE) and percentage bias (PBias) are commonly used metrics. Including these metrics would enhance the appeal of the scores package to a broader hydrological audience.

Please provide a reference that describes the metric, statistical test or data processing tool
KGE: Gupta, H.V., Kling, H., Yilmaz, K.K., Martinez, G.F., 2009. Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. J. Hydrol., 377(1-2): 80-91.

PBias: https://search.r-project.org/CRAN/refmans/hydroGOF/html/pbias.html
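
For convenience, the standard formulations from those two references (my own transcription, so please double-check against the sources):

```latex
% KGE (Gupta et al., 2009): r is the linear correlation between simulation
% and observation, alpha the ratio of standard deviations, beta the ratio of means.
\mathrm{KGE} = 1 - \sqrt{(r - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2},
\qquad
\alpha = \frac{\sigma_{\mathrm{sim}}}{\sigma_{\mathrm{obs}}},
\qquad
\beta = \frac{\mu_{\mathrm{sim}}}{\mu_{\mathrm{obs}}}

% PBias (as implemented in hydroGOF):
\mathrm{PBias} = 100 \cdot \frac{\sum_{i=1}^{n} (\mathrm{sim}_i - \mathrm{obs}_i)}{\sum_{i=1}^{n} \mathrm{obs}_i}
```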

tennlee commented 3 months ago

Thanks very much for raising those. I'd be happy to put both on our to-do list. Let us know if there is a particular deadline to consider, and if we can accommodate it we will. Also, if it's something you would be interested in doing, we would welcome code contributions.

If I have understood correctly, percentage bias is the multiplicative bias of the error term, expressed as a whole percentage (i.e. on a 1-100 scale rather than 0-1). As such, I would probably be inclined to implement it as percentage_error_bias (rather than percentage_bias) to avoid confusion. Does this seem reasonable to you?

In terms of our existing metrics, multiplicative bias is somewhat similar to percentage bias (https://scores.readthedocs.io/en/stable/api.html#scores.continuous.multiplicative_bias) - is that something you might use in the meantime?
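
In the interim, something along these lines might do. This is just a sketch: the (fcst, obs) call follows our API docs, and the 100 * (mb - 1) conversion assumes percentage bias is defined against mean(obs) as discussed above.

```python
import xarray as xr
from scores.continuous import multiplicative_bias

fcst = xr.DataArray([1.0, 1.5, 2.0])
obs = xr.DataArray([2.0, 2.0, 2.0])

# multiplicative bias = mean(fcst) / mean(obs)
mb = multiplicative_bias(fcst, obs)

# If percentage bias is defined as 100 * mean(fcst - obs) / mean(obs),
# it can be recovered from the multiplicative bias as 100 * (mb - 1).
pbias_interim = 100 * (mb - 1)
print(float(mb), float(pbias_interim))  # 0.75 -25.0
```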

durgals commented 3 months ago

@tennlee: Thank you for your prompt reply and for considering our request to implement new metrics.

There is no strict deadline for implementing these metrics in scores, as we will keep using our own verification scores in the meantime. I am interested in contributing code.

Yes, percentage bias is the multiplicative bias of the error term (additive bias / mean(obs) * 100). Regarding naming, I am happy with either percentage_error_bias or percentage_additive_bias.

tennlee commented 3 months ago

@durgals If you're happy to contribute code, that would be warmly appreciated and would definitely speed things up. I'd suggest starting by reading through the following implementations (these should link directly to our code on GitHub for additive and multiplicative bias):

https://github.com/nci/scores/blob/b57325318b698e17a970a448f96f3adfe13f0eee/src/scores/continuous/standard_impl.py#L244
https://github.com/nci/scores/blob/b57325318b698e17a970a448f96f3adfe13f0eee/src/scores/continuous/standard_impl.py#L293
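
To give a feel for the shape of the thing, here's a stripped-down sketch of the metric itself. It deliberately leaves out the dimension and weights handling you'll see in the real implementations, and the name is just a placeholder pending the naming discussion.

```python
import xarray as xr

def additive_bias_percentage(fcst: xr.DataArray, obs: xr.DataArray) -> xr.DataArray:
    """Additive bias as a percentage of the mean observation:
    100 * mean(fcst - obs) / mean(obs).
    """
    error = fcst - obs
    return 100 * error.mean() / obs.mean()
```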

We have a contributor's guide available here: https://scores.readthedocs.io/en/stable/contributing.html

There is also a pull request checklist here: https://github.com/nci/scores/blob/develop/.github/pull_request_template.md

Those docs can seem lengthy and maybe a bit overwhelming, but we'll happily help out along the way, and we don't expect PRs to be perfect the first time. I'd suggest making a fork of the repo, getting started, and letting us know how it goes.

If percentage bias goes well, we can then move on to KGE.

How does that sound?

tennlee commented 3 months ago

In terms of what to name the method in the module, I can think of:

- additive_bias_percentage
- error_bias_percentage

Either seems fine to me. I might ask @nicholasloveday or @rob-taggart to share an opinion as well if it's important to them.

durgals commented 3 months ago

Thank you for the links to the implementations and the contributor's guide.

I'll fork the repository, begin working on percentage bias, and reach out if I have any questions or need further assistance. Regarding the naming, since percentage bias is a modification of additive_bias (100 * additive_bias / mean(obs)), I prefer additive_bias_percentage to error_bias_percentage.

nicholasloveday commented 3 months ago

One thing to think about is whether pbias (or whatever it ends up being named) should be covered in the existing additive and multiplicative bias tutorial or in its own tutorial. My preference is to extend the existing bias tutorial with pbias rather than create a separate one: https://github.com/nci/scores/blob/develop/tutorials/Additive_and_multiplicative_bias.ipynb

durgals commented 2 months ago

Hi @tennlee, I was testing the additive_bias_percentage function, but it failed due to a precision error (see below). The expected value is [-50, -100., (0.5/3+0.5/3)/(-0.5/3)*100], which is [-50., -100., -200.]. The test failed because of precision issues with the -200.0 value. When I used xr.testing.assert_allclose(result, expected, rtol=1e-10, atol=1e-10) instead of xr.testing.assert_equal, the test passed. Is it OK to use assert_allclose here?

[screenshot: failing test output]
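
For reference, here's a synthetic reproduction of the failure mode; the perturbed value stands in for the actual floating point noise in my test (the real data differ):

```python
import xarray as xr

result = xr.DataArray([-50.0, -100.0, -200.00000000000003])  # float noise on -200
expected = xr.DataArray([-50.0, -100.0, -200.0])

# xr.testing.assert_equal(result, expected)  # fails: values not exactly equal
xr.testing.assert_allclose(result, expected, rtol=1e-10, atol=1e-10)  # passes
```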

nicholasloveday commented 2 months ago

Hi @durgals - we have used xr.testing.assert_allclose several times in the repository (e.g., https://github.com/search?q=repo%3Anci%2Fscores%20assert_allclose&type=code)

I think it is fine to use to deal with floating point precision issues.

tennlee commented 2 months ago

Thanks for checking.

Yes, it's fine by me also to use assert_allclose to handle floating point precision errors; we're not trying to second-guess underlying library maths implementations.

Perhaps we should add a note on precision somewhere in the documentation.

tennlee commented 2 months ago

In addition, while it's totally fine to wait until you're ready to create the pull request, you are also very welcome to create a pull request while your code is still under development. Our convention is to mark such PRs with "(under development)" at the start of the title. Some developers find this helpful so they can ask questions about particular parts of the code as they go, but it's totally up to you whether you'd like to work that way. I just thought I'd mention it.

tennlee commented 2 months ago

Hi @durgals. While the PR for percent bias is still undergoing review, it's looking really good. If you feel motivated, you would be welcome to open a PR for KGE in parallel rather than feeling like you need to wait. It's up to you, either way is fine, but you obviously have the hang of things so it would be reasonable to proceed with confidence.

durgals commented 2 months ago

Thanks, @tennlee. Yes, I've considered creating another KGE branch while percent bias is under review. I actually have working code; the only remaining task is to move it to scores and conduct rigorous testing.
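
For context, the core of what I have follows the 2009 formulation directly. A minimal sketch (not my final code, and without the dimension handling and input checks a scores version would need):

```python
import xarray as xr

def kge_2009(fcst: xr.DataArray, obs: xr.DataArray) -> xr.DataArray:
    """KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2),
    where r is the linear correlation between fcst and obs,
    alpha = std(fcst) / std(obs) and beta = mean(fcst) / mean(obs).
    """
    r = xr.corr(fcst, obs)
    alpha = fcst.std() / obs.std()
    beta = fcst.mean() / obs.mean()
    return 1 - ((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2) ** 0.5
```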

tennlee commented 1 month ago

@durgals I am going to close this issue since it contains general discussion and a way forward has been agreed. Please create a new issue for each new score you would like included.