openclimatefix / predict_pv_yield

Using optical flow & machine learning to predict PV yield
MIT License

Try without historical GSP PV #90

Open JackKelly opened 2 years ago

JackKelly commented 2 years ago

We should see how PV forecasting performance changes if we don't have historical GSP PV.

The GSP PV data we'll have at inference time will be less accurate than the historical PV Live data we have downloaded. In real time, Sheffield Solar only get data from 1k PV systems, whereas the historical data is based on 20k PV systems.

Sheffield Solar don't archive the intra-day PV Live estimates.

I'll ask ESO if they archive the intra-day PV Live estimates.

The basic problem is that, when we validate our PV forecasts using held-out historical data, we're likely to see a more flattering score than we'll get in production (because, in production, we won't have access to such accurate PV Live data). But if we find that our models work fine without any historical PV Live data as an input, then we can be more confident that the performance we see in validation will be repeated in production.
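To make the concern concrete, here's a toy sketch (not our actual model or data). It uses a synthetic sine-shaped PV curve and a simple persistence forecast, and compares the score when the model's recent PV history is accurate (like validation) against the score when that history is noisy (like production). The function names, the noise level of 0.1 and the half-hourly curve are all assumptions made up for illustration.

```python
import math
import random


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def synthetic_pv_day(n_steps=48):
    """Half-hourly, sine-shaped PV curve in arbitrary units (synthetic, not PV Live)."""
    return [math.sin(math.pi * i / (n_steps - 1)) for i in range(n_steps)]


def persistence_forecast(recent_pv):
    """Predict each timestep as the most recent PV value given to the model."""
    return recent_pv[:-1]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
accurate_pv = synthetic_pv_day()

# Hypothetical stand-in for intra-day estimates: accurate values plus Gaussian
# noise. The noise level (0.1) is an assumption, not a measured property.
intra_day_pv = [max(0.0, v + rng.gauss(0.0, 0.1)) for v in accurate_pv]

targets = accurate_pv[1:]  # always scored against the accurate values

mae_validation = mean_absolute_error(targets, persistence_forecast(accurate_pv))
mae_production = mean_absolute_error(targets, persistence_forecast(intra_day_pv))

print(f"MAE with accurate history (validation-like): {mae_validation:.3f}")
print(f"MAE with noisy history (production-like):    {mae_production:.3f}")
```

The same forecast rule scores worse when its input history is noisy, even though the targets are identical. A model that doesn't take recent PV Live as an input wouldn't be exposed to this gap.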

JackKelly commented 2 years ago

Quick update:

ESO might archive intra-day regional PV Live. Lyndon will ask around.

Owen at Sheffield Solar might have re-run some intra-day analyses. Jamie will ask Owen for us.

Perhaps OCF should start archiving intra-day regional PV Live?

I'm a little worried that the difference between the intra-day and the updated regional PV Live estimates might actively hurt our PV models in production. So we might be better off not giving our models the recent history of regional PV Live (and, instead, getting the model to focus on using PVOutput.org data).

peterdudfield commented 2 years ago

Code is ready; just need to run the model.

peterdudfield commented 2 years ago

This has been run, and it did not make much difference. We need to run it twice (with and without historical GSP PV) to provide some numbers.
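A minimal sketch of how the two runs could be compared once both exist. `compare_runs` and its argument names are hypothetical helpers, not part of this repo, and the values passed in below are made-up placeholders to show the call, not results from the model.

```python
def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def compare_runs(y_true, preds_with_history, preds_without_history):
    """Score both runs on the same validation targets and report the change."""
    mae_with = mean_absolute_error(y_true, preds_with_history)
    mae_without = mean_absolute_error(y_true, preds_without_history)
    change = mae_without - mae_with
    return {
        "mae_with_history": mae_with,
        "mae_without_history": mae_without,
        "absolute_change": change,
        # Relative change is undefined if the baseline error is exactly zero.
        "relative_change": change / mae_with if mae_with else float("nan"),
    }


# Made-up placeholder values purely to show the call; not results from this model.
y_true = [1.0, 2.0, 3.0]
report = compare_runs(y_true, [1.0, 2.5, 3.0], [1.5, 2.5, 3.0])
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

Both runs need to be scored on the same held-out timesteps and GSPs, otherwise the difference in MAE mixes the effect of the input change with a change in the validation set.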