Open peterdudfield opened 5 months ago
hello, can I be assigned to this issue?
We have been looking into this issue, and we think the predictions are so spiky because we download input data every 15 minutes and use it for predictions. Using hourly data results in much smoother plots. Here is an example. I'm not sure whether smoothing would be an appropriate approach here; if you want the plots/results to be less spiky, we can simply use lower-frequency data. The model might not be optimal at this higher frequency, as it was trained on hourly data.
Hi @froukje
If it was trained on hourly data, we should probably use hourly data in inference as well. I would probably go for that fix first. Are you able to make a PR for this? Thanks
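For anyone following along, here is a rough sketch of what moving the inputs from 15-minute to hourly resolution could look like with pandas. The helper name `to_hourly`, the `irradiance` column, and the sample values are all made up for illustration, and averaging within each hour is an assumption; the actual fix might instead request hourly data directly or keep only the on-the-hour rows.

```python
import pandas as pd


def to_hourly(inputs: pd.DataFrame) -> pd.DataFrame:
    """Average 15-minute rows into hourly rows (hypothetical helper)."""
    # "60min" bins start on the hour; each bin holds four 15-minute rows.
    return inputs.resample("60min").mean()


# Made-up 15-minute inputs covering two hours.
index = pd.date_range("2024-06-01 00:00", periods=8, freq="15min")
inputs_15min = pd.DataFrame(
    {"irradiance": [0.0, 4.0, 8.0, 12.0, 10.0, 10.0, 30.0, 10.0]},
    index=index,
)

inputs_hourly = to_hourly(inputs_15min)
print(inputs_hourly)
```

The second hour contains a single spike (30.0) that is averaged down in the hourly value, which is the kind of smoothing effect described above.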
hello, can I be assigned to this issue?
Thanks @Plomo-02, it's probably best that @froukje has a go at this first.
Yes, sure. No problem.
This issue can be closed. The predictions have been changed to hourly data.
Could a Kalman filter be applied just before plotting? Even with the 15-minute data, it would smooth the curve. A PID algorithm might also work, since it would reduce the large spikes caused by noise in the PV data.
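To make the Kalman suggestion concrete, here is a minimal sketch of a one-dimensional Kalman filter with a random-walk state model, applied to a series of predictions. The function name `kalman_filter_1d` and the default variances are assumptions chosen for illustration, not values tuned for this project. Note that this is a forward filter, so it damps spikes but also lags behind real changes.

```python
import numpy as np


def kalman_filter_1d(values, process_var=1.0, measurement_var=25.0):
    """Scalar Kalman filter with a random-walk model (illustrative only)."""
    values = np.asarray(values, dtype=float)
    filtered = np.empty_like(values)

    estimate = values[0]         # start from the first observation
    error_var = measurement_var  # initial uncertainty of the estimate
    filtered[0] = estimate

    for i in range(1, len(values)):
        # Predict: the state is assumed unchanged, uncertainty grows.
        error_var += process_var
        # Update: blend the prediction with the new observation.
        gain = error_var / (error_var + measurement_var)
        estimate = estimate + gain * (values[i] - estimate)
        error_var = (1.0 - gain) * error_var
        filtered[i] = estimate

    return filtered


# Made-up predictions with one large spike.
spiky = [10.0, 10.0, 40.0, 10.0, 10.0]
print(kalman_filter_1d(spiky))
```

A larger `measurement_var` relative to `process_var` gives a smoother curve, at the cost of reacting more slowly to genuine ramps in generation.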
The current XGBoost model's predictions are quite spiky. This is likely due to the ML model itself.
It might be worth smoothing this? We probably need to make sure we smooth before the night-time filter.
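A small sketch of why the order matters, assuming the night-time filter simply sets the forecast to zero wherever a boolean night mask is true. The helper name `smooth_then_night_filter`, the centred rolling mean, and the sample values are hypothetical. If smoothing ran after the night-time filter, daytime values would leak into the night-time steps.

```python
import pandas as pd


def smooth_then_night_filter(forecast: pd.Series, is_night: pd.Series,
                             window: int = 3) -> pd.Series:
    """Smooth with a centred rolling mean, then zero the night-time steps."""
    smoothed = forecast.rolling(window, center=True, min_periods=1).mean()
    # Keep smoothed values during the day; force exactly 0.0 at night.
    return smoothed.where(~is_night, 0.0)


# Made-up forecast around sunrise.
index = pd.date_range("2024-06-01 03:00", periods=6, freq="60min")
forecast = pd.Series([0.0, 0.0, 6.0, 0.0, 9.0, 3.0], index=index)
is_night = pd.Series([True, True, False, False, False, False], index=index)

result = smooth_then_night_filter(forecast, is_night)
print(result)
```

In this example the rolling mean alone would give 2.0 at the second (night-time) step, and applying the night mask last puts it back to zero.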
You can see this here