mandarjoshi90 / coref

BERT for Coreference Resolution
Apache License 2.0

Assertion Error coming in predict.py #27

Closed Harika-3196 closed 4 years ago

Harika-3196 commented 4 years ago

When I pass text in the required JSON format, prediction works for some examples but throws an assertion error for others. I found that this is caused by longer inputs. Can't SpanBERT coref be used on a paragraph with long sentences that total more than 512 words?

InvalidArgumentError (see above for traceback): assertion failed: [] [Condition x <= y did not hold element-wise:x (bert/embeddings/strided_slice_3:0) = ] [679] [y (bert/embeddings/assert_less_equal/y:0) = ] [512] [[Node: bert/embeddings/assert_less_equal/Assert/Assert = Assert[T=[DT_STRING, DT_STRING, DT_INT32, DT_STRING, DT_INT32], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](bert/embeddings/assert_less_equal/All, bert/embeddings/assert_less_equal/Assert/Assert/data_0, bert/embeddings/assert_less_equal/Assert/Assert/data_1, bert/embeddings/strided_slice_3, bert/embeddings/assert_less_equal/Assert/Assert/data_3, bert/embeddings/assert_less_equal/y)]]

mandarjoshi90 commented 4 years ago

That's right. Most BERT-based models have that limitation. In this case, you should chop the document into independent chunks of 512 word pieces.
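
A minimal sketch of that chunking, assuming the document is already split into sentences of word pieces. `chunk_sentences` and `max_len` are hypothetical names, not part of this repository, and the budget of `max_len - 2` assumes each chunk also carries BERT's `[CLS]` and `[SEP]` tokens. It packs whole sentences greedily and only hard-splits a sentence that is longer than the budget on its own.

```python
from typing import List


def chunk_sentences(sentences: List[List[str]], max_len: int = 512) -> List[List[str]]:
    """Greedily pack whole sentences into chunks of at most max_len - 2 word pieces."""
    budget = max_len - 2  # assumption: reserve room for [CLS] and [SEP]
    chunks: List[List[str]] = []
    current: List[str] = []
    for sentence in sentences:
        # A sentence longer than the budget is hard-split into budget-sized pieces.
        pieces = [sentence[i:i + budget] for i in range(0, len(sentence), budget)] or [[]]
        for piece in pieces:
            # Start a new chunk when this piece would overflow the current one.
            if current and len(current) + len(piece) > budget:
                chunks.append(current)
                current = []
            current.extend(piece)
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be written out as its own document in the required JSON format and predicted separately. How `minimize.py` actually segments documents may differ from this sketch.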

Harika-3196 commented 4 years ago

I tried chopping the text into chunks, but related context ends up split across different chunks.

Anyway, thanks for the reply and your time, @mandarjoshi90.

mandarjoshi90 commented 4 years ago

I'm not quite sure I understand the exact problem. Could you please explain with an example? It might also be helpful to look at minimize.py.

Harika-3196 commented 4 years ago

> I'm not quite sure I understand the exact problem. Could you please explain with an example? It might also be helpful to look at minimize.py.

Thanks @mandarjoshi90, yes, you got my question. As you suggested, we split the very long text into chunks. Say an entity (a noun) appears in one chunk and the pronoun that refers to it appears in another. The coref model then fails to resolve the pronoun to its actual entity, because the two are in different chunks. Is my understanding correct?

Example: I have split my text into two chunks, text1 and text2, because my original text is a very long paragraph.

text1

"' the shires of toothbrush today outcome with me. yes, that would be great. how about sunshade? those trophies? i know. if i even Odonata palm coming at cod. I really like the pink one, and I love the blue one . what should we get then now? okay, i what you should have a sense to the dinette for . i know young is in every two out'","' the shires of tooth brush today outcome with me. yes, that would be great . how about sunshade? those trophies? i know. if i even Odonata palm coming at cod. I really like the pink one, and I love the blue one . what should we get then now? okay, i what you should have a sense to the dinette for . a i know young is in every two out '"

text2

"""he was going, guys , what's the vet for thirteen years at Regina ben? you may review and unboxing of a Colgate three toothbrush. so to open it , you just pull like the reading part i just say , good , very difficult , extreme difficulty. so its just the us . but you think of it the one at one , the holder . like this potentiation restored toothpaste . i use it something all i have while i'm brushing my teeth . so , yeah , works what i do that do it that way you hold it like this . and so yeah , i really like it was took a part one . if you guys don't know what i'm doing , please like comment it described to much and of more random content . and pleasures vitali appreciate a thick so much logic . have a nice day""",""" he was going , guys , what ' s the vtt for thirteen years at regina ben ? guys my review and unboxing of a colgate three toothbrush .

My question is this: the pronoun, say "he" at the very beginning of text2, cannot be resolved against the context in text1.

Note: ignore the content of the example text.
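
The situation described above can be made concrete with a small sketch. It assumes fixed-size, non-overlapping chunks over a flat list of word pieces; `chunk_index` and `same_chunk` are hypothetical helpers, not functions from this repository, and the positions are made up for illustration. When an antecedent and a pronoun land in different chunks and each chunk is predicted independently, no single input contains both mentions.

```python
def chunk_index(token_position: int, chunk_size: int = 510) -> int:
    """Index of the non-overlapping chunk that contains this word-piece position."""
    return token_position // chunk_size


def same_chunk(antecedent_pos: int, pronoun_pos: int, chunk_size: int = 510) -> bool:
    """True if both mentions fall inside one chunk and can be seen together."""
    return chunk_index(antecedent_pos, chunk_size) == chunk_index(pronoun_pos, chunk_size)


# Illustrative positions only (assumed, not taken from the example text).
print(same_chunk(40, 95))    # both in chunk 0 -> True
print(same_chunk(500, 520))  # chunk 0 vs. chunk 1 -> False
```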