openclimatefix / ocf_datapipes

OCF's DataPipe based dataloader for training and inference
MIT License

Option to select only the closest PV systems #102

Open peterdudfield opened 1 year ago

peterdudfield commented 1 year ago

Detailed Description

When choosing PV systems for an example, add an option to choose only the closest ones.

Context

If there are 1000 PV systems available and we only want 16, maybe it's best to take the ones that are closest. On the other hand, a random selection means we could get PV systems spread over a wide area.

Possible Implementation

Add something at the point where we select the PV systems, in here

peterdudfield commented 1 year ago

Would be interested in your thoughts @JackKelly and @jacobbieker

jacobbieker commented 1 year ago

I would probably be inclined to leave it for more random systems. If you want only PV systems in a certain area, then you can chain together cropping a spatial area of the PV data, and then choosing N PV systems. It might be worth having a separate one that chooses the N closest systems, but I think that would just be almost the same.

peterdudfield commented 1 year ago

Yeah, I thought it would just be an option in that datapipe, either random or closest, like method='random' or method='closest'
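
A minimal sketch of what such a `method` option could look like, written with plain NumPy rather than the actual ocf_datapipes API. The function name `select_pv_systems`, its parameters, and the use of straight-line distance on projected x/y coordinates are all assumptions made for illustration.

```python
import numpy as np


def select_pv_systems(x, y, centre_x, centre_y, n, method="random", rng=None):
    """Return the indices of up to n PV systems (hypothetical helper).

    method="random":  pick n systems uniformly at random, without replacement.
    method="closest": pick the n systems nearest to (centre_x, centre_y).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = min(n, len(x))  # never ask for more systems than exist

    if method == "random":
        rng = np.random.default_rng() if rng is None else rng
        return np.sort(rng.choice(len(x), size=n, replace=False))

    if method == "closest":
        # Euclidean distance; assumes x/y are in a projected coordinate system
        distance = np.hypot(x - centre_x, y - centre_y)
        return np.argsort(distance, kind="stable")[:n]

    raise ValueError(f"method must be 'random' or 'closest', got {method!r}")


# Example: five systems along a line, centre at the origin
closest_idx = select_pv_systems([0, 10, 1, 5, 2], [0, 0, 0, 0, 0], 0, 0, n=3, method="closest")
random_idx = select_pv_systems([0, 10, 1, 5, 2], [0, 0, 0, 0, 0], 0, 0, n=3, method="random")
```

With `method="closest"` the result is ordered nearest first, while `method="random"` returns a sorted random subset, so the same downstream code can consume either.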

jacobbieker commented 1 year ago

Ah yeah, in that case adding it makes sense to me. We probably want to get the N closest ones for things like pseudolabelling as well.

peterdudfield commented 1 year ago

OK, I'll give it a go