It's only really relevant for small encoders, e.g. with 2 layers, which is usually the case when it is used as a frontend, e.g. for a Conformer. Although this can also happen during pretraining of a larger BLSTM encoder.
In that case, having a single pooling of size 6 is usually worse than having two poolings of sizes 2 and 3. This may depend on what comes afterwards, but it holds at least when the BLSTM is used as a frontend for a Conformer. So in that use case, the current default is suboptimal.
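To illustrate what I mean, here is a minimal sketch (PyTorch, with hypothetical module and parameter names, not the actual implementation here) of the two variants: total time-downsampling factor 6 in both cases, once as a single pool of size 6 and once split into pools of size 2 and 3 over the two layers.

```python
import torch
import torch.nn as nn


class BlstmFrontend(nn.Module):
    """Stack of BLSTM layers with time max-pooling after each layer (sketch)."""

    def __init__(self, in_dim: int, hidden_dim: int, pool_sizes: list[int]):
        super().__init__()
        self.layers = nn.ModuleList()
        self.pools = nn.ModuleList()
        dim = in_dim
        for pool in pool_sizes:
            self.layers.append(
                nn.LSTM(dim, hidden_dim, batch_first=True, bidirectional=True)
            )
            # Max-pool over the time axis; pool size 1 means no downsampling.
            self.pools.append(nn.MaxPool1d(kernel_size=pool, ceil_mode=True))
            dim = 2 * hidden_dim
        self.out_dim = dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features)
        for lstm, pool in zip(self.layers, self.pools):
            x, _ = lstm(x)
            # MaxPool1d expects (batch, channels, time), so transpose around it.
            x = pool(x.transpose(1, 2)).transpose(1, 2)
        return x


# Same total downsampling factor (6), distributed differently over the layers:
single_pool = BlstmFrontend(in_dim=80, hidden_dim=512, pool_sizes=[1, 6])
split_pool = BlstmFrontend(in_dim=80, hidden_dim=512, pool_sizes=[2, 3])
```

The point is only the `pool_sizes` distribution, i.e. `[1, 6]` vs. `[2, 3]`; the rest is just scaffolding to make the sketch self-contained.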
I'm not really sure what's expected.