fenchri / edge-oriented-graph

Source code for the EMNLP 2019 paper: "Connecting the Dots: Document-level Relation Extraction with Edge-oriented Graphs"

A new task with more entity types and relation types #21

Closed: crystal-xu closed this issue 4 years ago

crystal-xu commented 4 years ago

Thanks for your great work.

I have been trying your model on a new task with 9 entity types and 97 relation types. I have made the following modifications:

Is there anything else I need to modify? I am encountering a "nan" issue in the graph layer, but I am not sure whether it is because I forgot something important.

Thanks very much!

fenchri commented 4 years ago

Hi, thank you for your interest!

The steps you describe seem good to me. Is the dataset concept-level or mention-level? Would you also mind providing some additional information regarding the error so I can figure out what is causing it?

crystal-xu commented 4 years ago

Hi, thanks for your quick reply.

The dataset has both concept-level and mention-level annotations. I have converted my dataset to the same format as yours and intend to do concept-level RE.

I have done some troubleshooting. If I remove the attention layer, the issue disappears; with the attention layer in place, "nan" values appear in the "m_cntx" matrix, which seems to cause the problem. So the root cause is probably the attention layer.
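
For reference, this is roughly the check I added to spot where the nan first shows up (just a sketch; check_nan and its placement are my own, and m_cntx stands for the output of the attention layer):

import torch

def check_nan(name, tensor):
    # raise as soon as any nan appears so the offending batch can be inspected
    if torch.isnan(tensor).any():
        raise RuntimeError(f"nan detected in {name}")

# e.g. right after the attention layer:
# check_nan("m_cntx", m_cntx)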

Have you ever encountered a similar issue before? I really appreciate your help.

fenchri commented 4 years ago

Hi, thanks for the additional info!

Not really, this was never an issue for me.

Actually, this error can occur when computing the weights in the attention layer. In that layer, padded words, as well as the mentions used as queries, are masked. If a sentence contains only 2 mentions and no other words, the whole vector is filled with -inf and the softmax output becomes invalid.
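
As a rough illustration (a toy example, not the exact code from the repository), this is what happens when every position a query can attend to is masked:

import torch

scores = torch.randn(1, 3)                 # attention logits for a single query
mask = torch.tensor([[True, True, True]])  # every position is masked out
scores = scores.masked_fill(mask, float('-inf'))
print(torch.softmax(scores, dim=-1))       # tensor([[nan, nan, nan]]): the exponentials sum to 0, giving 0/0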

Could you check the example sentence and mentions where this error first appears and see what it looks like?

crystal-xu commented 4 years ago

Hi, thanks for your reply.

I have printed out the matrices where this error first appears. It seems to happen for the reason you mentioned: all words in the sentence are masked and the vector is filled with -inf, so the softmax returns nan. Would you mind telling me whether the attention layer should return all zeros in this case? If so, maybe manually replacing such a nan vector with a zero vector would work?
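
Concretely, I was thinking of something along these lines (just a sketch on a toy tensor, not the actual code):

import torch

# toy attention weights where the second query was fully masked and came out as nan
alpha = torch.tensor([[0.3, 0.7],
                      [float('nan'), float('nan')]])
alpha = torch.where(torch.isnan(alpha), torch.zeros_like(alpha), alpha)
print(alpha)  # the nan row is now all zeros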

Thanks very much!

fenchri commented 4 years ago

Hi,

Softmax returns nan because, when every logit is -inf, all the exponentials are zero, so the denominator of the softmax is zero and the division yields nan. In this case we can apply a naive fix: when a vector is full of -inf, replace it with ones, compute the softmax, and then replace the resulting weights with zeros.

Could you replace line 64 here with the following? Also, please check that the resulting weight vector is full of zeros after this process. If that works for you, I'll make sure to update the code with this fix.

# if every entry in a row is -inf (fully masked query), temporarily replace the row with ones
alpha = torch.where(torch.isinf(alpha).all(dim=2, keepdim=True), torch.full_like(alpha, 1.0), alpha)
alpha = self.softmax(alpha)
# rows that came out exactly uniform (1/num_keys everywhere) are the ones we filled with ones: zero them out
alpha = torch.where(torch.eq(alpha, 1/alpha.shape[2]).all(dim=2, keepdim=True), torch.zeros_like(alpha), alpha)
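
To double-check, a toy example like this one (assuming self.softmax is nn.Softmax(dim=2); the tensor values are made up for illustration) should print exactly zeros for the fully masked query:

import torch

softmax = torch.nn.Softmax(dim=2)

# 1 batch, 2 queries, 3 keys; the second query is fully masked
alpha = torch.tensor([[[0.5, 1.0, -2.0],
                       [float('-inf'), float('-inf'), float('-inf')]]])

alpha = torch.where(torch.isinf(alpha).all(dim=2, keepdim=True), torch.full_like(alpha, 1.0), alpha)
alpha = softmax(alpha)
alpha = torch.where(torch.eq(alpha, 1/alpha.shape[2]).all(dim=2, keepdim=True), torch.zeros_like(alpha), alpha)

print(alpha[0, 1])  # tensor([0., 0., 0.])
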
crystal-xu commented 4 years ago

Hi,

I have tried your update and dumped the results. It works, great job! Thank you so much for your help!

BTW, I think it would be very useful if you could adapt the model to a multi-GPU setup. I have been attempting this myself in my own way. Thanks for your efforts.

fenchri commented 4 years ago

No problem, happy to help! Sure, I will add this to my todo list and hopefully I will be able to get to it soon enough :)

I am closing this issue, since it seems resolved. Good luck!