openclimatefix / nowcasting_dataset

Prepare batches of data for training machine learning solar electricity nowcasting data
https://nowcasting-dataset.readthedocs.io/en/stable/
MIT License

Add map to each ML training example showing location of installed PV systems #184

Open JackKelly opened 2 years ago

JackKelly commented 2 years ago

Detailed Description

At the moment, for each ML example, we ask our ML models to directly predict total solar PV power generation for an entire region of the country. Specifically, each region is the area that's electrically connected to a grid supply point (GSP). A GSP is basically a huge electricity substation that is the boundary between the transmission system and the distribution system.

We get the estimated total PV generation for each GSP region from Sheffield Solar's excellent PV Live Regional API. These estimates go back to 2014.

Behind the scenes, Sheffield Solar maintain a map of the locations of installed solar PV systems. This map changes over time. So, for example, Sheffield Solar's estimates of total PV power generation for each GSP for 2016 were created using their map of what PV was installed in 2016.

If we can feed the PV map (for each GSP, and for each timestep) into our ML models, then our ML models will know which patches of the satellite image to focus on.

It's not the end of the world if we can't use this map. With luck, our models may implicitly learn the location of the PV systems for each GSP, and learn how that map changes over time. But it's almost certainly better to explicitly provide this map as an input to the ML model, to give the ML model less to learn for itself :)
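Because the PV map changes over time, each ML example needs the map that was valid at that example's timestamp. Here is a minimal sketch of that lookup, assuming the map arrives as a time-ordered series of snapshots. The function name `select_capacity_snapshot` and the snapshot dates are hypothetical and are not part of nowcasting_dataset.

```python
from bisect import bisect_right
from datetime import datetime


def select_capacity_snapshot(snapshot_times, example_time):
    """Return the index of the latest snapshot at or before example_time.

    snapshot_times must be sorted in ascending order.
    """
    idx = bisect_right(snapshot_times, example_time) - 1
    if idx < 0:
        raise ValueError("example_time precedes the earliest capacity snapshot")
    return idx


# Hypothetical snapshot dates, for illustration only.
snapshot_times = [
    datetime(2016, 1, 1),
    datetime(2016, 2, 1),
    datetime(2016, 3, 1),
]

# An example from mid-February should use the February snapshot (index 1).
snapshot_index = select_capacity_snapshot(
    snapshot_times, datetime(2016, 2, 14, 12, 30)
)
```

Picking the most recent snapshot at or before the example time avoids leaking information about PV systems installed after the example was recorded.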

This issue is related to #182

Related issues

This issue is about getting the capacity map into nowcasting_dataset.

Let's discuss how to encode the map for our ML models in https://github.com/openclimatefix/nowcasting_dataloader/issues/24

JackKelly commented 2 years ago

Update from Jamie at Sheffield Solar:

@JamieTaylor-TUOS has very kindly shared PV installed capacity, aggregated by LSOA (Lower Layer Super Output Area, also abbreviated LLSOA), per month as a CSV file (see the email Jamie sent on 7th Oct 2021).

We'll also need to use the LSOA boundary shapefile.

Jamie says:

LSOAs vary in size. The biggest ones are in Scotland. The largest is ~1200 km² !! The median size is 0.418 km² though, so in general they're pretty small.
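Since most LSOAs are far smaller than a typical grid cell, one rough way to turn the per-LSOA capacities into a map is to assign each LSOA's capacity to the grid cell containing its centroid. The sketch below illustrates that idea under stated assumptions: the CSV column names (`lsoa_code`, `month`, `capacity_mwp`), the centroid coordinates, and all values are made-up stand-ins, not the format or contents of Jamie's file. The very large LSOAs would be badly served by this shortcut and would need their polygons from the boundary file spread across several cells.

```python
import csv
import io

# Toy stand-in data, NOT real values. Column names are assumptions.
TOY_CSV = """lsoa_code,month,capacity_mwp
A,2016-01,1.5
B,2016-01,0.5
C,2016-01,2.0
A,2016-02,1.75
"""

# Hypothetical LSOA centroids as (easting, northing) in metres.
TOY_CENTROIDS = {
    "A": (100.0, 200.0),
    "B": (1500.0, 200.0),
    "C": (1900.0, 900.0),
}


def capacity_grid(csv_text, centroids, month, origin, cell_size_m, n_rows, n_cols):
    """Sum installed capacity per grid cell for one month.

    Row 0 is the southernmost row and column 0 the westernmost column.
    LSOAs whose centroid falls outside the grid are ignored.
    """
    x0, y0 = origin
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for record in csv.DictReader(io.StringIO(csv_text)):
        if record["month"] != month:
            continue
        x, y = centroids[record["lsoa_code"]]
        col = int((x - x0) // cell_size_m)
        row = int((y - y0) // cell_size_m)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] += float(record["capacity_mwp"])
    return grid


# One row of two 1 km cells, with the grid origin at (0, 0).
grid = capacity_grid(
    TOY_CSV,
    TOY_CENTROIDS,
    month="2016-01",
    origin=(0.0, 0.0),
    cell_size_m=1000.0,
    n_rows=1,
    n_cols=2,
)
```

With the toy data, LSOA A lands in the western cell and LSOAs B and C both land in the eastern cell, so their capacities are summed there.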

So, I think we're good to go! Thanks again, @JamieTaylor-TUOS!