allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0
11.74k stars 2.25k forks source link

Incorrect validation in MaxPoolingSpanExtractor #5707

Closed david-waterworth closed 2 years ago

david-waterworth commented 2 years ago

I've been testing the MaxPoolingSpanExtractor - I find in my case it does work significantly better than using EndpointSpanExtractor, in particular, it significantly reduces the number of false positives but it doesn't change the number of false negatives.

However, in order to make it work, I had to create a fork. The problem is the default padding for span indices is -1

https://github.com/allenai/allennlp/blob/1caf0dafa3bc8d0bb309a46e2ccb12f714923260/allennlp/data/fields/span_field.py#L61

yet the MaxPoolingSpanExtractor specifically checks that every span index is between 0 and max sequence length ignoring the mask.

https://github.com/allenai/allennlp/blob/b2eb036e06fbf4e293abb126552aaabc3df91aa1/allennlp/modules/span_extractors/max_pooling_span_extractor.py#L77

Hence for padded sequences this validation always asserts. The correct validation would be to check that only unmasked indices are valid (I just commented it out, I already check in my dataloader) - I would also suggest this should be done in the base class forward rather than one of the implementations.

david-waterworth commented 2 years ago

Also note this line

    adapted_span_indices = torch.tensor(span_indices, device=span_indices.device)

results in

UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requiresgrad(True), rather than torch.tensor(sourceTensor). adapted_span_indices = torch.tensor(span_indices, device=span_indices.device)

github-actions[bot] commented 2 years ago

This issue is being closed due to lack of activity. If you think it still needs to be addressed, please comment on this thread 👇