titipata / detecting-scientific-claim

Extracting scientific claims from biomedical abstracts (powered by AllenNLP)

Wrong masking in sentences to vector encoding? #16

Closed by daniel-acuna 6 years ago

daniel-acuna commented 6 years ago

Here, it seems that you want to extract the mask per sentence rather than across sentences:

https://github.com/titipata/detecting-scientific-claim/blob/fc1e00a3256627b5f75011a043055e1c56233661/discourse/models/discourse_crf_model.py#L62-L68

This can be done by changing line 62 to

sentence_masks = util.get_text_field_mask(sentences, 1)

and line 68 to

encoded_sentences.append(self.sentence_encoder(embedded_sentences[:, i, :, :], sentence_masks[:, i, :]))

titipata commented 6 years ago

Oh yeah, you're right! I will fix that accordingly.
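
For readers following the thread, the difference between a mask across sentences and a mask per sentence can be sketched in plain Python. Nested lists stand in for tensors here. The helper names `token_mask` and `mask_for_sentence` and the padding id 0 are illustrative assumptions, not code from the repository. The sketch assumes that a mask computed with one wrapping dimension has shape (batch, num_sentences, num_tokens), so the mask that lines up with `embedded_sentences[:, i, :, :]` is sentence i taken from every abstract in the batch.

```python
from typing import List

PAD = 0  # assumed padding id (hypothetical; chosen so that 0 means "no token")

Batch = List[List[List[int]]]  # (batch, num_sentences, num_tokens)


def token_mask(batch: Batch) -> Batch:
    """Return a mask of the same shape as batch: 1 for real tokens, 0 for padding."""
    return [[[int(tok != PAD) for tok in sent] for sent in doc] for doc in batch]


def mask_for_sentence(mask: Batch, i: int) -> List[List[int]]:
    """Take sentence i from every abstract: the list analogue of mask[:, i, :]."""
    return [doc[i] for doc in mask]


# Two abstracts, each padded to 3 sentences of 4 token ids.
batch = [
    [[5, 8, 2, 0], [7, 3, 0, 0], [0, 0, 0, 0]],
    [[4, 4, 9, 1], [6, 0, 0, 0], [2, 5, 0, 0]],
]

mask = token_mask(batch)

# Per-sentence mask: one row per abstract, shape (batch=2, num_tokens=4).
first_sentence_mask = mask_for_sentence(mask, 0)

# Indexing the first dimension instead selects every sentence of abstract 0,
# shape (num_sentences=3, num_tokens=4), which does not match a batch of sentences.
all_sentences_of_first_abstract = mask[0]
```

In tensor terms, `mask_for_sentence(mask, i)` corresponds to the slice `[:, i, :]`, whereas indexing with `[i]` alone picks out a single abstract rather than a single sentence position.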