ECMWFCode4Earth / ml_drought

Machine learning to better predict and understand drought. Moving github.com/ml-clim
https://ml-clim.github.io/drought-prediction/

Review - Models #99

Closed cvitolo closed 4 years ago

cvitolo commented 5 years ago
tommylees112 commented 5 years ago

We see that you implemented five different models: (1) Linear Regression, (2) Base Neural Network, (3) Linear Neural Network, (4) Recurrent Neural Network and (5) Entity Aware LSTM. You have nice graphics on the model functionality. However, without a textual description, the diagrams are quite cryptic. It would be very interesting if you could elaborate, as part of the documentation, on why you chose these five models, as well as your opinion on each model's advantages and disadvantages. How do the models perform at predicting drought?

The initial results showed that adding more data improved performance, suggesting that the simple input datasets we began with were not enough to capture the complexity of agricultural drought.

We chose Linear Regression as a simple baseline that we expect our more complicated models to outperform. Linear Regression can be an extremely useful tool, and if we are unable to beat its performance then we cannot justify the extra complexity of the neural networks.

The BaseNeuralNetwork is a parent class that defines behaviour and attributes common to all of our PyTorch neural networks. It is not a model itself; it simply provides behaviour that the child classes inherit.
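This inheritance pattern can be sketched roughly as follows. The class and method names here (`BaseModel`, `train_epoch`, `LinearModel`) are illustrative, not the repository's actual API:

```python
import torch
from torch import nn


class BaseModel:
    """Hypothetical sketch of a parent class holding shared training logic."""

    def __init__(self, learning_rate: float = 1e-3):
        self.learning_rate = learning_rate
        self.model = None  # set by child classes to an nn.Module

    def train_epoch(self, x: torch.Tensor, y: torch.Tensor) -> float:
        # Shared training step inherited by every child model
        optimizer = torch.optim.Adam(self.model.parameters(), lr=self.learning_rate)
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(self.model(x), y)
        loss.backward()
        optimizer.step()
        return loss.item()


class LinearModel(BaseModel):
    """Child class: only defines its architecture, inherits the rest."""

    def __init__(self, input_size: int):
        super().__init__()
        self.model = nn.Linear(input_size, 1)
```

The child classes then only need to define their architecture; the training loop, data handling and evaluation behaviour come from the parent.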

The Linear Neural Network is a neural network composed of linear layers. Each linear layer is followed by a Rectified Linear Unit (ReLU) activation function, batch normalisation and dropout. The number and size of the layers are configurable.
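A minimal sketch of such a stack of linear blocks, assuming a single regression output; the function name and parameters are illustrative rather than the repository's actual interface:

```python
import torch
from torch import nn


def make_linear_network(input_size, layer_sizes, dropout=0.25):
    """Sketch: a stack of linear blocks, each with ReLU, batch norm and dropout.

    `layer_sizes` controls the number and size of the hidden layers.
    """
    layers = []
    in_features = input_size
    for size in layer_sizes:
        layers += [
            nn.Linear(in_features, size),
            nn.ReLU(),
            nn.BatchNorm1d(size),
            nn.Dropout(dropout),
        ]
        in_features = size
    layers.append(nn.Linear(in_features, 1))  # single regression output
    return nn.Sequential(*layers)
```

For example, `make_linear_network(10, [64, 32])` builds a network with two hidden layers of 64 and 32 units.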

The Recurrent Neural Network incorporates the temporal structure of recurrence into the network architecture. It is an implementation of a Long Short-Term Memory network (LSTM), where the output of one cell is fed as the input to the next: each cell produces a prediction at a particular timestep and carries a memory of previous conditions.
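The recurrence described above can be sketched with PyTorch's built-in `nn.LSTM`; the class name and hyperparameters here are illustrative:

```python
import torch
from torch import nn


class RecurrentSketch(nn.Module):
    """Illustrative LSTM: each cell passes its hidden state (its "memory")
    forward in time, and a linear head maps every timestep's output to a
    prediction."""

    def __init__(self, n_features, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # x: (batch, timesteps, features)
        outputs, _ = self.lstm(x)  # hidden state at every timestep
        return self.head(outputs)  # one prediction per timestep
```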

The Entity Aware LSTM was recently introduced in [the paper here](). It was initially applied to rainfall-runoff modelling and incorporates the fact that some variables change over time (rainfall, temperature) whereas others are more static (soil type, topography). This separation of dynamic and static variables is also an interesting way of approaching drought, because we too have variables that change at each timestep and variables that are constant across all timesteps that the model sees.
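The core idea of separating static from dynamic inputs can be sketched in a heavily simplified form. In this sketch (which is an assumption-laden illustration, not the full entity-aware cell), a gate computed once from the static features modulates the LSTM's per-timestep outputs:

```python
import torch
from torch import nn


class EALSTMSketch(nn.Module):
    """Simplified sketch of the entity-aware idea: a gate computed from the
    static features modulates the recurrent processing of the dynamic
    (per-timestep) inputs. A full EA-LSTM cell has more machinery."""

    def __init__(self, n_dynamic, n_static, hidden_size=32):
        super().__init__()
        # Static variables (e.g. soil type) -> a fixed gate in [0, 1]
        self.input_gate = nn.Sequential(nn.Linear(n_static, hidden_size), nn.Sigmoid())
        # Dynamic variables (e.g. rainfall, temperature) -> recurrent states
        self.dynamic_lstm = nn.LSTM(n_dynamic, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x_dynamic, x_static):
        gate = self.input_gate(x_static)        # (batch, hidden): static gate
        out, _ = self.dynamic_lstm(x_dynamic)   # (batch, timesteps, hidden)
        gated = out * gate.unsqueeze(1)         # modulate by static gate
        return self.head(gated[:, -1])          # predict from final timestep
```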

It would be nice to have an overview / comparison of the performance of the five models implemented

Would you like these to be in terms of architecture (like in the notebooks/docs/04_models.ipynb notebook) or in terms of performance? The performance visualisations are ongoing.

How are your models taking into account the spatial correlations amongst variables?

We currently take spatial correlations into account in a rather crude way, via a parameter defining the number of surrounding_pixels. We have currently set this parameter to 1, which means we take the 3x3 window around each target pixel: the 8 surrounding pixels that are at most 1 pixel away. The values of the variables at these pixels are appended as extra columns in the X matrix. In this way we incorporate the information in the surrounding pixels into the model predictions.
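The idea can be sketched as follows. The function name and the array-shift implementation here are illustrative, not the repository's actual code:

```python
import numpy as np


def add_surrounding_pixels(grid, surrounding_pixels=1):
    """Sketch of the `surrounding_pixels` idea: for each target pixel, gather
    the values of its neighbours (via simple array shifts) so they can be
    appended as extra feature columns.

    grid: 2D array of a single variable (lat x lon).
    Returns an array of shape (lat, lon, n_neighbours), NaN-padded at edges.
    """
    offsets = [
        (dy, dx)
        for dy in range(-surrounding_pixels, surrounding_pixels + 1)
        for dx in range(-surrounding_pixels, surrounding_pixels + 1)
        if (dy, dx) != (0, 0)
    ]
    features = []
    for dy, dx in offsets:
        shifted = np.full_like(grid, np.nan, dtype=float)
        # shifted[i, j] = grid[i + dy, j + dx] wherever that index is valid
        dest_y = slice(max(0, -dy), grid.shape[0] - max(0, dy))
        src_y = slice(max(0, dy), grid.shape[0] - max(0, -dy))
        dest_x = slice(max(0, -dx), grid.shape[1] - max(0, dx))
        src_x = slice(max(0, dx), grid.shape[1] - max(0, -dx))
        shifted[dest_y, dest_x] = grid[src_y, src_x]
        features.append(shifted)
    return np.stack(features, axis=-1)
```

With `surrounding_pixels=1` this yields 8 extra columns per variable (one per neighbour in the 3x3 window); larger values grow the window accordingly.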

To find the optimum number of surrounding_pixels we would have to test different values for this parameter and see at what point performance stops improving.

It is worth noting that the spatial relationships of these pixels are not explicitly communicated to the models - this is room for further experimentation, perhaps using a CNN.