Closed: david-waterworth closed 2 years ago
Also note that this line

```python
adapted_span_indices = torch.tensor(span_indices, device=span_indices.device)
```
results in
```
UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  adapted_span_indices = torch.tensor(span_indices, device=span_indices.device)
```
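The warning can be avoided by following the pattern PyTorch recommends. A minimal sketch (the tensor values here are made up for illustration):

```python
import torch

# Example span indices; in practice these come from the span extractor's input.
span_indices = torch.tensor([[0, 2], [1, 3]])

# torch.tensor(existing_tensor) re-copies the data and emits the UserWarning.
# The recommended equivalent is clone().detach(), optionally moved to a device:
adapted_span_indices = span_indices.clone().detach().to(span_indices.device)
```

This keeps the copy semantics (a new tensor detached from any autograd graph) without triggering the warning.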
This issue is being closed due to lack of activity. If you think it still needs to be addressed, please comment on this thread 👇
I've been testing the `MaxPoolingSpanExtractor`. In my case it works significantly better than the `EndpointSpanExtractor`; in particular, it substantially reduces the number of false positives, though it doesn't change the number of false negatives. However, in order to make it work I had to create a fork. The problem is that the default padding value for span indices is -1:
https://github.com/allenai/allennlp/blob/1caf0dafa3bc8d0bb309a46e2ccb12f714923260/allennlp/data/fields/span_field.py#L61
yet the `MaxPoolingSpanExtractor` specifically checks that every span index is between 0 and the max sequence length, ignoring the mask:

https://github.com/allenai/allennlp/blob/b2eb036e06fbf4e293abb126552aaabc3df91aa1/allennlp/modules/span_extractors/max_pooling_span_extractor.py#L77
Hence for padded sequences this validation always fails. The correct validation would be to check only the unmasked indices (I just commented the check out, since I already validate spans in my dataloader). I would also suggest doing this in the base class's `forward` rather than in one of the implementations.
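A mask-aware version of the check could look like the sketch below. The helper name and exact bounds convention (inclusive end indices) are assumptions, not the AllenNLP implementation:

```python
import torch

def check_unmasked_span_indices(span_indices, span_mask, sequence_length):
    """Validate only spans where span_mask is True.

    span_indices: (batch, num_spans, 2) tensor; padded spans hold -1.
    span_mask:    (batch, num_spans) boolean tensor.
    Assumes inclusive end indices, so valid indices lie in [0, sequence_length).
    """
    valid = span_mask.bool()
    starts = span_indices[..., 0][valid]
    ends = span_indices[..., 1][valid]
    if starts.numel() > 0 and (starts.min() < 0 or ends.max() >= sequence_length):
        raise IndexError("unmasked span index out of range")
```

With this, a batch containing `[-1, -1]` padding spans passes as long as the mask marks them as padding, while a genuinely out-of-range unmasked span still raises.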