It's standard to skip masking when running a CNN over word / character sequences, because max pooling can effectively ignore the padding tokens anyway. It'd be interesting to verify that this is actually true by implementing masking for our CNN and seeing what difference it makes, if any.
Pretty low priority, though. Just getting a thought out of my head and into an issue tracker.
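
For reference, a minimal sketch of what masked max-over-time pooling could look like. This is plain NumPy rather than our actual CNN code, and `masked_max_pool`, `features`, and `mask` are hypothetical names made up for illustration. It assumes the convolution output has shape (batch, time, channels) and the mask is 1 for real tokens and 0 for padding. The toy values show one case where the two approaches can differ: when every real activation in a channel is negative, an unmasked max picks up the padding position's value instead.

```python
import numpy as np

def masked_max_pool(features, mask):
    """Max-over-time pooling that ignores padded positions.

    features: (batch, time, channels) convolution outputs (assumed layout).
    mask: (batch, time), 1 for real tokens and 0 for padding.
    """
    keep = mask.astype(bool)[:, :, None]  # broadcast over channels
    # Padded positions become -inf so they can never win the max.
    masked = np.where(keep, features, -np.inf)
    return masked.max(axis=1)

# Toy input: one sequence of length 4 whose last two positions are padding.
features = np.array([[[-2.0, 0.5],
                      [-1.0, 0.1],
                      [0.0, 0.0],    # padding position
                      [0.0, 0.0]]])  # padding position
mask = np.array([[1, 1, 0, 0]])

unmasked = features.max(axis=1)           # padding competes for the max
masked = masked_max_pool(features, mask)  # padding is excluded

print(unmasked)  # [[0.  0.5]]
print(masked)    # [[-1.   0.5]]
```

Note that a sequence consisting entirely of padding would pool to `-inf` here, so a real implementation would need to handle that case. Whether the difference in the first channel matters in practice (e.g. after a ReLU) is exactly what the experiment would have to show.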