Open JasonFengGit opened 7 months ago
Thanks @JasonFengGit for this
We'll have to think about how to create a new test dataset that doesn't have any missing generation values.
We could filter out timestamps with missing values, but that would introduce some biases that are hard to analyze.
I think we could filter out the missing ones and introduce new ones. As long as we then do some analysis on the new test set and check it's not biased, it should be OK.
What biases were you thinking about?
For example, the missing values might be due to similar reasons and could share some patterns that are either easier or harder to predict, thereby making the evaluation biased.
Ah I see. From what I've seen, they are normally quite random, as they are all random PV panels throughout the UK. But we can check this.
Oh OK! That would make it easier.
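One rough way to check whether the missing values really are random is to compare the missing rate across groups, for example by hour of day. The sketch below is only an illustration: the function name `missing_rate_by_hour`, the `(timestamp, value)` row layout, and the sample rows are assumptions, not taken from the repo. The same idea could be applied per PV system instead of per hour.

```python
from collections import defaultdict
from datetime import datetime


def missing_rate_by_hour(rows):
    """Return {hour: fraction of rows with a missing value}.

    `rows` is an iterable of (iso_timestamp, generation_power) pairs,
    where a missing generation_power is represented as None.
    This layout is an assumption made for the sketch.
    """
    totals = defaultdict(int)
    missing = defaultdict(int)
    for timestamp, value in rows:
        hour = datetime.fromisoformat(timestamp).hour
        totals[hour] += 1
        if value is None:
            missing[hour] += 1
    return {hour: missing[hour] / totals[hour] for hour in sorted(totals)}


# Hypothetical sample rows, not real evaluation data.
rows = [
    ("2021-01-01T12:00", 0.42),
    ("2021-01-01T12:30", None),
    ("2021-01-01T13:00", 0.40),
    ("2021-01-01T13:30", 0.38),
]
rates = missing_rate_by_hour(rows)
for hour, rate in rates.items():
    print(f"hour {hour:02d}: {rate:.0%} missing")
```

If the rates look roughly flat across groups, filtering out the missing timestamps is less likely to skew the test set. Large differences between groups would suggest the kind of bias discussed above.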
@JasonFengGit Nice spot! thanks for this
Describe the bug
In evaluation, some of the real/expected values of generation_power are missing.
To Reproduce
Steps to reproduce the behavior:
Run python scripts/run_evaluation.py with the following testset.csv (a small test to illustrate the bug):
Then check results.csv in the generation_power columns. Example results.csv:
Expected behavior
No missing values (or maybe some fallbacks to handle missing values).
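As a possible starting point for such a fallback, here is a minimal sketch that counts how many rows of a results file have no generation_power value, so the evaluation could warn about them or skip them. Only the generation_power column name comes from this issue. The helper name `count_missing`, the other column names, and the sample data are assumptions for illustration, and the real results.csv may use different column names.

```python
import csv
import io


def count_missing(csv_file, column="generation_power"):
    """Return (n_missing, n_rows) for `column` in an open CSV file.

    Empty strings and the literal "nan" are treated as missing.
    """
    n_missing = 0
    n_rows = 0
    for row in csv.DictReader(csv_file):
        n_rows += 1
        value = (row.get(column) or "").strip()
        if value == "" or value.lower() == "nan":
            n_missing += 1
    return n_missing, n_rows


# Hypothetical sample standing in for results.csv, not real output.
sample = io.StringIO(
    "pv_id,timestamp,generation_power\n"
    "1,2021-01-01T12:00,0.42\n"
    "1,2021-01-01T13:00,\n"
    "2,2021-01-01T12:00,nan\n"
)
missing, total = count_missing(sample)
print(f"{missing} of {total} rows are missing generation_power")
```

To run this against a real file, pass an open file handle, e.g. `with open("results.csv", newline="") as f: count_missing(f)`.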