tom-andersson / icenet-paper

Code associated with the paper 'Seasonal Arctic sea ice forecasting with probabilistic deep learning'
GNU General Public License v3.0

question about models.py #3

Closed: ZHUYMGeo closed this issue 2 years ago

ZHUYMGeo commented 2 years ago

Hi, can the model in icenet/models.py be considered a spatio-temporal prediction model?

tom-andersson commented 2 years ago

Hi @ZHUYMGeo, thanks for the question. The short answer is yes, IceNet is a model that tackles a spatiotemporal prediction problem.

The long answer, however, is that the model itself is not a spatiotemporal model. The UNet architecture in icenet/models.py does not explicitly model the time dimension; it only explicitly models the spatial dimensions. The input batch of data is a tensor of shape (n_samples, n_rows, n_cols, n_channels). Different environmental variables and times are interlaced along the channel dimension. For example, surface temperature at lag 0, surface temperature at lag 1, ..., sea ice concentration at lag 0, etc. are all concatenated along the channel dimension. The model's six output forecast months are likewise treated as essentially six separate predictions; there is no embedded notion that the forecast month 3 output is followed by the forecast month 4 output.

Through training, though, IceNet learns that the lag-0 input channels are more important, and that they are most important for the short-term 1-month forecast. This suggests that the model has learned to handle time appropriately (but only by observing the spatial relationships between the input and output data). See the interpretability section of our paper for more details.
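To make the channel layout described above concrete, here is a minimal NumPy sketch of stacking several variables at several lags along one channel axis. The grid size, variable names, number of lags, and channel ordering are all assumptions chosen for illustration; they are not IceNet's actual configuration or data loader.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration (not IceNet's real settings).
n_samples, n_rows, n_cols = 2, 4, 5
variables = ["surface_temperature", "sea_ice_concentration"]  # assumed names
n_lags = 3

rng = np.random.default_rng(0)

# One (n_samples, n_rows, n_cols) map for every (variable, lag) pair.
fields = {
    (var, lag): rng.random((n_samples, n_rows, n_cols))
    for var in variables
    for lag in range(n_lags)
}

# Assumed channel order: all lags of the first variable, then all lags of the next.
channel_names = [(var, lag) for var in variables for lag in range(n_lags)]

# Stack every map along a new trailing axis to form the channel dimension.
x = np.stack([fields[key] for key in channel_names], axis=-1)

print(x.shape)  # (2, 4, 5, 6): (n_samples, n_rows, n_cols, n_channels)
```

Once stacked like this, the network sees only six anonymous channels. Nothing in the tensor itself says that channel 0 and channel 1 are the same variable one month apart, which is why any notion of time has to be learned from the data.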

There are deep learning models that explicitly model the temporal dimension of the data, for example the ConvLSTM. In those models the input tensor would have a shape like (n_samples, n_rows, n_cols, n_variables, n_times), where the UNet's channel dimension has been separated out into a variable dimension (e.g. sea ice, temperature, pressure, ...) and a time dimension. These models are harder to train than simpler UNet models, however.
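As a rough sketch of that separation, the snippet below reshapes a channel-stacked tensor into explicit variable and time axes. It assumes the channels are ordered variable-major (all times of variable 0, then all times of variable 1), and the sizes are made up for illustration. The final step reflects the fact that Keras's `ConvLSTM2D` layer, in its default channels-last format, expects the time axis second, i.e. (samples, time, rows, cols, channels).

```python
import numpy as np

# Hypothetical sizes for illustration only.
n_samples, n_rows, n_cols = 2, 4, 5
n_variables, n_times = 2, 3

# Channel-stacked input, as a UNet would receive it.
# Assumption: channel index = variable_index * n_times + time_index.
x = np.arange(
    n_samples * n_rows * n_cols * n_variables * n_times, dtype=np.float32
).reshape(n_samples, n_rows, n_cols, n_variables * n_times)

# Split the channel axis into separate variable and time axes.
x_split = x.reshape(n_samples, n_rows, n_cols, n_variables, n_times)

# Move time to axis 1 for a layer that expects (samples, time, rows, cols, channels).
x_time_major = np.moveaxis(x_split, -1, 1)

print(x_split.shape)       # (2, 4, 5, 2, 3)
print(x_time_major.shape)  # (2, 3, 4, 5, 2)
```

The reshape is only valid if the channel ordering matches the assumption in the comment; with a different interleaving, the variable and time axes would come out scrambled.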

I'll close this issue, but let me know if you have any further questions.