openclimatefix / Open-Source-Quartz-Solar-Forecast

Open Source Solar Site Level Forecast
MIT License

Should we smooth the xgboost model #124

Open peterdudfield opened 1 month ago

peterdudfield commented 1 month ago

The current xgboost model's output is quite spiky. This is likely due to the ML model itself.

It might be worth smoothing this? If so, we probably need to make sure we smooth before applying the night-time filter.

You can see this here: [Screenshot 2024-05-30 at 15 31 35]
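As a rough illustration of the ordering mentioned above (smooth first, then apply the night-time filter), here is a minimal pandas sketch. It is not the repository's code: the function name `smooth_then_mask_night`, the centered rolling mean, the window size, the toy values, and the "before 04:00 is night" rule are all assumptions made for the example.

```python
import pandas as pd


def smooth_then_mask_night(
    power_kw: pd.Series, is_night: pd.Series, window: int = 3
) -> pd.Series:
    """Smooth with a centered rolling mean, then zero out night-time steps."""
    smoothed = power_kw.rolling(window=window, center=True, min_periods=1).mean()
    # The night-time filter goes last, so smoothing cannot smear
    # daytime power into the night-time steps.
    return smoothed.where(~is_night, 0.0)


# Toy 15-minute forecast with a spike; the values are made up for illustration.
index = pd.date_range("2024-05-30 03:00", periods=8, freq="15min")
power_kw = pd.Series([0.0, 0.0, 0.2, 0.9, 0.3, 0.5, 0.6, 0.7], index=index)

# Hypothetical night rule: everything before 04:00 counts as night.
is_night = pd.Series(index.hour < 4, index=index)

result = smooth_then_mask_night(power_kw, is_night)
print(result)
```

Doing it the other way round (filter, then smooth) would average the night-time zeros with neighbouring daytime values and produce small non-zero outputs at night.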

Plomo-02 commented 1 month ago

hello, can I be assigned to this issue?

froukje commented 1 month ago

We have been looking into this issue, and we think the predictions are so spiky because we download input data at 15-minute intervals and use these for the predictions. Using hourly data results in much smoother plots; here is an example. I'm not sure smoothing would be an appropriate approach here. If we want the plots/results to be less spiky, we can simply use lower-frequency data. The model might not be optimal at this higher frequency, as it was trained on hourly data.

Screenshot from 2024-05-31 13-27-04
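To make the "lower frequency data" option concrete, here is a minimal sketch that reduces a 15-minute input frame to hourly rows before it would be passed to a model. It is only an assumption about how this could be done: the helper `to_hourly`, the column name `shortwave_radiation`, and the values are hypothetical, and this thread does not say whether the actual change selected on-the-hour values or requested hourly data directly from the source.

```python
import pandas as pd


def to_hourly(inputs_15min: pd.DataFrame) -> pd.DataFrame:
    """Keep only the on-the-hour rows of a frame with a DatetimeIndex."""
    return inputs_15min[inputs_15min.index.minute == 0]


# Toy 15-minute weather inputs; the column name and values are hypothetical.
index = pd.date_range("2024-05-31 10:00", periods=9, freq="15min")
inputs_15min = pd.DataFrame(
    {"shortwave_radiation": [500, 620, 480, 650, 560, 700, 540, 690, 600]},
    index=index,
)

# Only the 10:00, 11:00 and 12:00 rows remain.
inputs_hourly = to_hourly(inputs_15min)
print(inputs_hourly)
```

Selecting on-the-hour rows keeps the inputs at the same resolution the model was trained on; averaging within each hour would be another option, but it changes what each input value means.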

peterdudfield commented 1 month ago

Hi @froukje

If it was trained on hourly data, we should probably use hourly data in inference as well. I would probably go for that fix first. Are you able to make a PR for this? Thanks

peterdudfield commented 1 month ago

> hello, can I be assigned to this issue?

Thanks @Plomo-02, it's probably best if @froukje has a go at this first.

froukje commented 4 weeks ago

Yes, sure. No problem.

froukje commented 3 weeks ago

This issue can be closed. The predictions have been changed to hourly data.