Open nishchay47b opened 3 years ago
Hi,
What do you mean by link prediction? LayoutLM can be used to process documents like invoices and receipts. It can extract information from them, and classify them.
By link prediction I mean the connection between the answer tokens and the question tokens. In the ground-truth annotations of the FUNSD dataset there is a field called linking that associates entities by id, i.e., it records which answer tokens belong to which question tokens, not just the class of each token. In the second point of this issue they gave some answers, but I didn't follow them and was hoping to get some understanding from you.
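To make the linking field concrete, here is a minimal sketch of reading it. It assumes the FUNSD JSON layout (a top-level `form` list whose entries carry `id`, `text`, `label`, and `linking` as `[from_id, to_id]` pairs). The sample content and the `extract_links` helper are made up for illustration and are not taken from the dataset or from any library.

```python
import json

# Made-up sample in the FUNSD annotation layout (not real dataset content).
sample = json.loads("""
{"form": [
  {"id": 0, "text": "Date:", "label": "question", "linking": [[0, 1]]},
  {"id": 1, "text": "12/03/98", "label": "answer", "linking": [[0, 1]]},
  {"id": 2, "text": "Name:", "label": "question", "linking": []}
]}
""")


def extract_links(annotation):
    """Return (question_text, answer_text) for every linked pair of entities."""
    by_id = {entity["id"]: entity for entity in annotation["form"]}
    pairs = set()
    for entity in annotation["form"]:
        # Each link is stored on both of its endpoints, so deduplicate with a set.
        for src, dst in entity["linking"]:
            pairs.add((src, dst))
    return [(by_id[s]["text"], by_id[d]["text"]) for s, d in sorted(pairs)]


links = extract_links(sample)
print(links)
```

Token classification alone only predicts the `label` of each entry; the pairs returned here are the extra supervision that link prediction would need.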
Thanks
Oh that's really cool, I didn't know that. Of course, linking the answers to the questions is important.
Given that LayoutLM is capable of identifying all questions and answers in the text, maybe you can add a layer on top that takes a question and an answer as input and classifies whether or not the two are linked (binary classification).
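One way to frame that binary classification is to first turn the annotations into labelled (question, answer) pairs. The sketch below only builds those training examples. It assumes FUNSD-style entities with `id`, `label`, and `linking` fields; `build_pair_examples` is a hypothetical helper, and the classifier itself is not shown.

```python
from itertools import product

# Made-up entities in the FUNSD layout: two questions, two answers.
entities = [
    {"id": 0, "label": "question", "linking": [[0, 1]]},
    {"id": 1, "label": "answer", "linking": [[0, 1]]},
    {"id": 2, "label": "question", "linking": [[2, 3]]},
    {"id": 3, "label": "answer", "linking": [[2, 3]]},
]


def build_pair_examples(entities):
    """Pair every question with every answer; target is 1 if linked, else 0."""
    gold = {tuple(link) for entity in entities for link in entity["linking"]}
    questions = [e["id"] for e in entities if e["label"] == "question"]
    answers = [e["id"] for e in entities if e["label"] == "answer"]
    return [(q, a, int((q, a) in gold)) for q, a in product(questions, answers)]


examples = build_pair_examples(entities)
print(examples)
```

Each triple is (question id, answer id, target). A pair classifier could then take the LayoutLM representations of the two entities as input and be trained on these targets. Pairing everything with everything grows quadratically, so on larger forms some candidate filtering (for example by distance on the page) would probably be needed.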
Hi, Thank you for such amazing notebooks and sorry if I am missing something basic. Are you doing link prediction in the layoutlm notebooks? If not, any ideas on how it can be done and does the original implementation do that, or is it only token classification?
@nishchay47b In the original layoutlm paper, it was stated entity linkage was beyond its scope. I agree it would be interesting to be able to predict this since the FUNSD annotation format already includes linkage information. I hope to explore this at some point in the future. I will share here if I do.
@cydal that would be great. They have given some hints on how they approached this problem: "Semantic linking is the task of predicting the relations between semantic entities. In this work, we focus on the semantic labeling task, while semantic linking is out of the scope. To fine-tune LayoutLM on this task, we treat semantic labeling as a sequence labeling problem. We pass the final representation into a linear layer followed by a softmax layer to predict the label of each token. The model is trained for 100 epochs with a batch size of 16 and a learning rate of 5e-5."
But with LayoutLMv2 out now, which has code samples related to relation extraction, I think it would be better to explore that, although the numbers don't seem very encouraging and inference on custom data is not very clear.
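For reference, the "linear layer followed by a softmax layer" from the quoted passage can be sketched in plain Python for a single token. This is a toy illustration and not the LayoutLM code: the hidden vector, weights, bias, and the three-label setup are made-up numbers, and `token_label_probs` is a hypothetical helper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [x / total for x in exps]


def token_label_probs(hidden, weight, bias):
    """Linear layer (one weight row per label) followed by softmax."""
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(weight, bias)
    ]
    return softmax(logits)


# Made-up final representation of one token (dimension 3) and a 3-label head.
hidden = [0.5, -1.0, 2.0]
weight = [[0.1, 0.0, 0.3], [0.0, 0.2, -0.1], [0.4, 0.1, 0.0]]
bias = [0.0, 0.0, 0.0]

probs = token_label_probs(hidden, weight, bias)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted)
```

This head scores each token independently, which is why it can label tokens as question or answer but cannot say which answer goes with which question.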