Open jacobbieker opened 3 years ago
To help, we should also add losses for optical flow, like end-point error (EPE). One example is here: https://github.com/NVIDIA/flownet2-pytorch/blob/master/losses.py
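For reference, a minimal sketch of what an end-point error metric could look like. It uses NumPy rather than the PyTorch code in the linked repo, and the function name `endpoint_error` and the `(..., 2, H, W)` layout are assumptions for illustration, not something taken from that file.

```python
import numpy as np


def endpoint_error(pred_flow, true_flow):
    """Mean Euclidean distance between predicted and true flow vectors.

    Assumed layout: (..., 2, H, W), where the axis at position -3 holds
    the (u, v) flow components.
    """
    pred_flow = np.asarray(pred_flow, dtype=np.float64)
    true_flow = np.asarray(true_flow, dtype=np.float64)
    diff = pred_flow - true_flow
    # L2 norm over the flow-component axis gives one error per pixel.
    per_pixel = np.sqrt((diff ** 2).sum(axis=-3))
    return float(per_pixel.mean())


# Every pixel is off by the vector (3, 4), so the error is 5 everywhere.
pred = np.zeros((2, 4, 4))
true = np.zeros((2, 4, 4))
true[0] = 3.0
true[1] = 4.0
print(endpoint_error(pred, true))  # 5.0
```

A training loss would need the same computation in the framework's tensor ops so that gradients flow through it.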
The easiest way to obtain the clean base image could be to just take the average over all of the days for a given place. Another option would be to use the binary cloud mask and only include a pixel in the average if it is "cloud free". While it could still have some influence from sub-pixel clouds, it'd probably be close enough.
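A rough sketch of the masked-average option, assuming a stack of images for one place with shape `(T, H, W)` and a boolean mask of the same shape where `True` means cloudy. The name `cloud_free_mean` and the array layout are made up for illustration.

```python
import numpy as np


def cloud_free_mean(images, cloud_mask):
    """Per-pixel mean over time, using only cloud-free observations.

    images:     (T, H, W) array of values for one place.
    cloud_mask: (T, H, W) boolean array, True where the pixel is cloudy.
    Returns an (H, W) array; pixels that are never cloud free are NaN.
    """
    images = np.asarray(images, dtype=np.float64)
    clear = ~np.asarray(cloud_mask, dtype=bool)
    counts = clear.sum(axis=0)                         # clear observations per pixel
    totals = np.where(clear, images, 0.0).sum(axis=0)  # sum over clear observations
    base = np.full(counts.shape, np.nan)
    # Divide only where at least one clear observation exists.
    np.divide(totals, counts, out=base, where=counts > 0)
    return base


# Three time steps of a 1x2 image.
images = np.array([[[1.0, 10.0]], [[3.0, 20.0]], [[100.0, 30.0]]])
mask = np.array([[[False, True]], [[False, True]], [[True, True]]])
base = cloud_free_mean(images, mask)
# base[0, 0] is 2.0 (mean of 1 and 3); base[0, 1] is NaN (always cloudy).
```

Pixels that come back as NaN could fall back to the plain average over all days.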
https://arxiv.org/pdf/1511.05440.pdf also goes into only using the loss on parts of the image above a certain optical flow threshold.
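One possible reading of the thresholding idea, as a sketch only: compute the error just over pixels whose flow magnitude exceeds a threshold. The function name `flow_masked_l1`, the L1 error, and the default threshold are assumptions here, not details taken from the paper.

```python
import numpy as np


def flow_masked_l1(pred, target, flow, threshold=1.0):
    """Mean absolute error over pixels whose flow magnitude exceeds threshold.

    pred, target: (H, W) images.
    flow:         (2, H, W) optical flow with (u, v) components.
    Returns 0.0 if no pixel is above the threshold.
    """
    flow = np.asarray(flow, dtype=np.float64)
    magnitude = np.sqrt((flow ** 2).sum(axis=0))
    moving = magnitude > threshold
    if not moving.any():
        return 0.0
    err = np.abs(np.asarray(pred, dtype=np.float64)
                 - np.asarray(target, dtype=np.float64))
    return float(err[moving].mean())


pred_img = np.zeros((2, 2))
target_img = np.array([[1.0, 5.0], [1.0, 5.0]])
flow = np.zeros((2, 2, 2))
flow[:, 0, 1] = (3.0, 4.0)  # only pixel (0, 1) moves, with magnitude 5
print(flow_masked_l1(pred_img, target_img, flow))  # 5.0
```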
Some very hand-wavy ideas about subtracting 'cloudy pixels' from 'background' here: https://github.com/openclimatefix/predict_pv_yield/issues/17
I love these ideas!
But, before diving into data augmentation, is there a strong reason to believe the models are overfitting? (Although I remember the Perceiver paper emphasising that the Perceiver likes to overfit more than most deep learning models!)
I don't think the models are overfitting actually, from what I'm seeing they aren't yet at least. So yeah, can definitely leave that for later in case they start to overfit!
As mentioned in #85, one pre-training idea is to create a flow dataset to pre-train on using clouds. We would need simulated flow, and would want to have realistic clouds in all spectral channels. The easiest way to do that, I think, would be to take real clouds, crop/paste them over the base ground image, and move them to create the flow. So, to do this, there needs to be