djanloo opened this issue 2 years ago · Status: Open
It seems that the encoder (and a considerable part of the LSTM) becomes useless when we train the whole net with the subnets frozen.
The image shows the inner weights of the first dense layer after the subnet concatenation.
In the upper two images the subnets are held untrainable while the concatenation head of the net is trained; in the lower two, the encoder and the LSTM are loaded pre-trained but left trainable.
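The frozen-subnet setup described above can be sketched roughly as follows. This is a minimal toy reconstruction, not the actual model: the subnet names, layer sizes, and input shapes are all assumptions for illustration.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical stand-ins for the pre-trained subnets (names and sizes assumed).
lstm_branch = keras.Sequential([layers.LSTM(16)], name="lstm_branch")
encoder_branch = keras.Sequential([layers.Dense(16)], name="encoder_branch")

# Freeze the subnets so that only the head after the concatenation is trained.
lstm_branch.trainable = False
encoder_branch.trainable = False

seq_in = keras.Input(shape=(8, 4))   # toy sequence input for the LSTM
enc_in = keras.Input(shape=(32,))    # toy input for the encoder
merged = layers.Concatenate()([lstm_branch(seq_in), encoder_branch(enc_in)])
head = layers.Dense(16, activation="relu")(merged)
out = layers.Dense(1)(head)
model = keras.Model([seq_in, enc_in], out)
model.compile(optimizer="adam", loss="mse")

# Only the two head Dense layers (kernel + bias each) remain trainable.
print(len(model.trainable_weights))  # 4
```

Setting `trainable = False` on the sub-models before compiling is what produces the "frozen" regime in the upper two images; flipping those flags to `True` gives the fine-tuned regime of the lower two.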
The most striking thing is the clarity of the frozen-weights image: almost all of the information is discarded, and only two cells of the LSTM subnet hold significant weights. The best the concatenation head can do is select those two cells and subtract them.
Actually, only one cell of the LSTM is useful. If the value of that cell is val, the layer outputs 16 values of approximately the form (0, 0, 0, ..., val, 0, ..., -val, 0, ...).
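The "only one useful cell" observation can be checked programmatically by scanning the weight matrix for inputs with a non-negligible outgoing weight. Below is a hedged sketch on synthetic data (the matrix shape, the cell index, and the 0.1 threshold are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weight matrix of the first dense layer after the concatenation:
# rows = concatenated inputs (16 LSTM cells + 16 encoder features),
# cols = 16 output units.
W = rng.normal(scale=1e-3, size=(32, 16))  # almost everything near zero
W[3, 5] = 1.2    # the single useful LSTM cell, contributing +val on one unit...
W[3, 11] = -1.2  # ...and -val on another, matching the (0,..,val,..,-val,..) pattern

# Per-input importance: largest absolute weight leaving each input.
importance = np.abs(W).max(axis=1)
significant = np.flatnonzero(importance > 0.1)
print(significant)  # [3] — only one input row carries real information
```

Running the same scan on the real layer's `get_weights()` output would make the "almost all information is neglected" claim quantitative.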
Trained LstmEnc with frozen LSTM
With a bigger encoder, the selection of a single subnet no longer happens. Now we should find a good way to display this. The two figures show the weights of the dense layer after the concatenation; since the encoder's last layer has shape (10348,) due to the flatten, the matrix is rather large.
This should not be the case, though.
The LSTM subnet alone reaches a resolution of approximately 37 m (RMSE), which is comparable to the best resolution we have achieved.
The encoder, on the other hand, gets stuck at a resolution of 54 m (RMSE), which suggests it is not informative enough to contribute to the final estimate.
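For reference, the resolution figures quoted above are plain root-mean-square errors over the test set; a minimal helper (with toy numbers, not the real predictions) would be:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error, here in metres."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy distances in metres, not the actual test-set values.
print(rmse([100.0, 200.0], [140.0, 170.0]))  # ≈ 35.36
```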
We should check the biases of the dense layer after the concatenation (the Concatenate layer itself has no weights) to confirm or dismiss this.
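Reading those biases out is a one-liner once the layer is named; a minimal sketch with a toy model (the layer name `dense_after_concat` and the sizes are assumptions):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy model standing in for the real one, just to show the inspection pattern.
a = keras.Input(shape=(4,))
b = keras.Input(shape=(4,))
merged = layers.Concatenate()([a, b])
dense = layers.Dense(3, name="dense_after_concat")
model = keras.Model([a, b], dense(merged))

# get_weights() returns [kernel, bias] for a Dense layer.
kernel, bias = model.get_layer("dense_after_concat").get_weights()
print(kernel.shape, bias.shape)  # (8, 3) (3,)
```

Large biases relative to the kernel entries would indicate the head is mostly outputting a constant, which would support the "subnet is neglected" reading.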