openclimatefix / metnet

PyTorch Implementation of Google Research's MetNet and MetNet-2
MIT License
231 stars, 47 forks

Missing lead time encoding? #28

Closed ValterFallenius closed 2 years ago

ValterFallenius commented 2 years ago

Too small changes depending on lead time

The model is able to learn something, but the output image seems to change too little depending on the lead time encoding (one-hot, from the input layer). Here are some examples of the output from two different models, one with 60 lead times and one with only 8. The left-hand plots show the ground truth precipitation in the prediction zone at different lead times. The right-hand plots show P(rain_rate > 0.2 mm/h), which I compute by summing the softmax probabilities of the 127 bins corresponding to rain > 0.2 mm/h.
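For reference, the P(rain_rate > 0.2 mm/h) map described above can be computed roughly as in the sketch below. It assumes 128 output bins in total, with bin 0 covering rain rates up to 0.2 mm/h and the remaining 127 bins covering everything above it. The total bin count, the 28x28 output size, and the function names are illustrative assumptions, not taken from the repository.

```python
import numpy as np

def softmax(logits, axis=0):
    # Subtract the max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def prob_rain_above_threshold(logits, first_rain_bin=1):
    """Sum softmax mass over all bins at or above `first_rain_bin`.

    logits: array of shape (num_bins, H, W), one logit per rain-rate bin.
    first_rain_bin: index of the first bin above 0.2 mm/h (assumed to be 1).
    Returns an (H, W) map of P(rain_rate > 0.2 mm/h).
    """
    probs = softmax(logits, axis=0)
    return probs[first_rain_bin:].sum(axis=0)

# Random logits stand in for a model output (hypothetical shapes).
rng = np.random.default_rng(0)
logits = rng.normal(size=(128, 28, 28))
p_rain = prob_rain_above_threshold(logits)
```

Plotting `p_rain` for several lead times side by side is a quick way to see whether the output actually depends on the lead time.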

60 lead times (5,10,15... 300 min)

[Figures: "3 leadtimes,8", "3 leadtimes,2", "3 leadtimes"]

Only 8 lead times (15, 30, 45, ..., 120 min). I changed the cmap to make it clearer that the two plots are not plotting the same thing.

[Figures: nowcasting1, nowcasting2]

For full 60 lead time network check out: w&b, 60 leads

For 8 lead time network check out: w&b, 8 leads

It should be noted that the 8 lead time network has not yet started overfitting.

I have implemented a sampling quality pass that, during training, makes sure each training sample only samples a lead time for which there are at least 5 rain pixels.
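A minimal sketch of such a sampling filter is shown below. It assumes a "rain pixel" means a target pixel above 0.2 mm/h and that the targets for one sample are stacked as (num_lead_times, H, W). The threshold, the array layout, and the function names are assumptions for illustration, not the code used for the runs above.

```python
import numpy as np

def valid_lead_times(targets, rain_threshold=0.2, min_rain_pixels=5):
    """Return indices of lead times whose target has enough rain pixels.

    targets: array of shape (num_lead_times, H, W) with rain rates in mm/h.
    """
    # Count pixels above the threshold separately for each lead time.
    rain_pixel_counts = (targets > rain_threshold).sum(axis=(1, 2))
    return np.flatnonzero(rain_pixel_counts >= min_rain_pixels)

def sample_lead_time(targets, rng, **kwargs):
    """Pick one valid lead time at random, or None if no lead time qualifies."""
    candidates = valid_lead_times(targets, **kwargs)
    if candidates.size == 0:
        return None
    return int(rng.choice(candidates))

# Toy example: lead 0 has no rain, lead 1 has 4 rain pixels, lead 2 has 6.
targets = np.zeros((3, 8, 8))
targets[1, 0, :4] = 1.0
targets[2, 0, :6] = 1.0
rng = np.random.default_rng(0)
chosen = sample_lead_time(targets, rng)
```

In this toy example only lead time index 2 passes the filter, so it is the only one that can be sampled.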

I suspect the axial attention layer is again the bottleneck; maybe I'm not using it right. We added a positional embedding so that it would know which pixel was where in the input layer, and I was wondering whether we should also add an embedding for which channel it is looking at, since the model seems to be forgetting which lead time it is handling. The ConvGRU outputs a 256x28x28 tensor.
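To make the one-hot lead time encoding concrete, here is a rough sketch of one common way to feed it in: the one-hot vector is tiled over the spatial dimensions and concatenated to the input as extra channels at every timestep. This may not match the repository's exact code; the function name and the (T, C, H, W) layout are assumptions.

```python
import numpy as np

def add_lead_time_channels(x, lead_time_index, num_lead_times):
    """Append a tiled one-hot lead time encoding as extra input channels.

    x: array of shape (T, C, H, W), the input sequence for one sample.
    Returns an array of shape (T, C + num_lead_times, H, W).
    """
    t, _, h, w = x.shape
    one_hot = np.zeros(num_lead_times, dtype=x.dtype)
    one_hot[lead_time_index] = 1
    # Tile the one-hot vector so each entry becomes a constant H x W plane.
    planes = np.broadcast_to(
        one_hot[None, :, None, None], (t, num_lead_times, h, w)
    )
    return np.concatenate([x, planes], axis=1)

# Toy input: 2 timesteps, 3 channels, 4x4 pixels, lead time index 5 of 8.
x = np.zeros((2, 3, 4, 4))
encoded = add_lead_time_channels(x, lead_time_index=5, num_lead_times=8)
```

Encoding the same input with two different lead time indices and comparing the intermediate activations, layer by layer, is one way to find where the lead time signal gets lost.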

Why is it performing so poorly?

ValterFallenius commented 2 years ago

Found a bug in my code today. The results above come from a model trained and tested with only the spatial downsampler and temporal encoder, with no axial attention... 1 month of work in vain. I'll be back with actual results in a few days.