[Open] djanloo opened this issue 2 years ago
Found a major issue in training the encoder subnet: apparently Keras will happily train a network with 4 output values against a single target value, silently broadcasting the target instead of raising a shape error. Adding a final Dense(1) layer would fix the problem.
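A minimal sketch of what I mean (the layer sizes and input shape are made up for illustration, not the actual encoder): with an MSE loss, Keras broadcasts the (batch, 1) targets against the (batch, 4) outputs, so the buggy model trains without complaint.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Dummy data, just to demonstrate the shape mismatch:
x = np.random.rand(32, 10, 3).astype("float32")  # (samples, timesteps, features)
y = np.random.rand(32, 1).astype("float32")      # one scalar target per sample

buggy = keras.Sequential([
    layers.LSTM(16, input_shape=(10, 3)),
    layers.Dense(4),  # 4 outputs trained against 1 target: Keras broadcasts
])
buggy.compile(optimizer="adam", loss="mse")
buggy.fit(x, y, epochs=1, verbose=0)  # runs without any shape error

fixed = keras.Sequential([
    layers.LSTM(16, input_shape=(10, 3)),
    layers.Dense(4, activation="relu"),
    layers.Dense(1),  # final Dense(1) makes the output shape match the target
])
fixed.compile(optimizer="adam", loss="mse")
fixed.fit(x, y, epochs=1, verbose=0)
```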
Does this need a comment? ... (LSTM)
I'm starting to think that what you've said in the first comment is correct. Maybe optimising the modules separately is not improving the performance of the whole network.
@luciapapalini LSTM with the final Dense(1) is a Bugatti
Now, with a 512 -> 256 -> ... -> 4 stack of Dense layers, it seems to learn a kind of trend.
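For reference, a rough sketch of such a tapering head; the widths between 256 and 4 are my guess at what the "..." stands for (halving each time), not the exact configuration used here:

```python
from tensorflow import keras
from tensorflow.keras import layers

def tapering_head(input_dim: int, top: int = 512, bottom: int = 4) -> keras.Sequential:
    """Stack Dense layers with widths halving from `top` down to `bottom`."""
    model = keras.Sequential([keras.Input(shape=(input_dim,))])
    width = top
    while width > bottom:
        model.add(layers.Dense(width, activation="relu"))
        width //= 2
    model.add(layers.Dense(bottom))  # final layer with the 4 output values
    return model

head = tapering_head(input_dim=64)  # input_dim is a placeholder value
head.summary()
```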
Since the LstmEncoder is a compound net, we should test the resolution of each subnet to enable a "divide and conquer" workflow (see the sketch below).
We should not take for granted that optimizing one part of the net will improve the whole net, but we'll never know unless we try.
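One possible way to compare the pieces, assuming "resolution" means the spread of prediction residuals on a held-out set; `encoder`, `full_net`, `x_val`, and `y_val` are hypothetical placeholders, not names from this repo:

```python
import numpy as np

def resolution(model, x_val, y_val):
    """Standard deviation of the prediction residuals: a quick figure of
    merit that can be computed for each subnet (and the full net) alone."""
    residuals = model.predict(x_val, verbose=0) - y_val
    return float(np.std(residuals))

# Hypothetical usage, with `encoder` and `full_net` standing in for the
# LstmEncoder's pieces and `x_val`/`y_val` for a validation split:
# print(f"encoder subnet resolution: {resolution(encoder, x_val, y_val):.3f}")
# print(f"full network resolution:   {resolution(full_net, x_val, y_val):.3f}")
```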