Closed: moazshorbagy closed this issue 4 years ago
Hi,
Though this is allowed in the TensorFlow implementation, it is not possible here because of the way the computations are designed in this layer. In short, when there are two None dimensions (the batch and time dimensions), I cannot use any of the reshaping the attention layer currently relies on.
I'll need to do some research on this topic; there might be a way to achieve this, but with degraded performance.
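To illustrate the limitation, here is a minimal sketch (not the layer's actual code): with both the batch and time dimensions unknown, a static reshape cannot be built at graph-construction time, and only a dynamic reshape based on `tf.shape` would work.

```python
import tensorflow as tf

# The time dimension is left as None; the batch dimension of a Keras symbolic
# tensor is always None, so two dimensions are unknown at graph-construction time.
inputs = tf.keras.Input(shape=(None, 64))
batch, time, dim = inputs.shape
print(batch, time, dim)  # None None 64

# A static reshape such as (batch * time, dim) cannot be constructed here,
# because batch * time is None * None; only a dynamic reshape driven by
# tf.shape(inputs) would work, which changes how the layer has to be written.
# flat = tf.keras.layers.Reshape((batch * time, dim))(inputs)  # would fail
```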
@moazshorbagy This support is available now.
@thushv89 Thank you, that's awesome.
The problem
TensorFlow has an awesome feature where one can use None as the sequence length, which allows variable sequence lengths across different batches. The code of the AttentionLayer gives an error when trying to use None as the sequence length.
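A minimal sketch of the intended usage is below. The import path and the `[encoder_outputs, decoder_outputs]` call convention are assumptions based on this repository's examples, and the layer sizes are arbitrary.

```python
import tensorflow as tf
from layers.attention import AttentionLayer  # assumed import path for this repo

# Both sequence lengths are left as None, so different batches may have
# different sequence lengths.
encoder_inputs = tf.keras.Input(shape=(None, 32), name='encoder_inputs')
decoder_inputs = tf.keras.Input(shape=(None, 32), name='decoder_inputs')

encoder_out = tf.keras.layers.GRU(64, return_sequences=True)(encoder_inputs)
decoder_out = tf.keras.layers.GRU(64, return_sequences=True)(decoder_inputs)

# This is the call that raises an error when the time dimension is None.
attn_out, attn_states = AttentionLayer(name='attention_layer')(
    [encoder_out, decoder_out])

model = tf.keras.Model([encoder_inputs, decoder_inputs], attn_out)
model.summary()
```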
The error trace