jpWang / LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
MIT License
342 stars 40 forks

Improve relation extraction #37

Open tuongtranegs opened 1 year ago

tuongtranegs commented 1 year ago

Hi @jpWang , thanks for your repo,

I have used it in my project to extract keys and values from document types with complicated layouts.

  1. The NER model looks good
  2. The RE model does not work well. For example, the RE model outputs: Q1 -> A2, Q2 -> A2, Q3 -> A1

I have an idea to improve the RE model. As far as I know, the RE model learns relation classification from the semantics of the language only. From my point of view, it could learn from position (position embedding) together with the semantics of the language to improve relation classification.

This should give a good result, like the one below: image

What do you think about my idea?
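
To make the idea concrete, here is a minimal sketch of the kind of position signal a relation classifier could use alongside text features. It is not code from this repo: `pair_geometry` and `nearest_answer` are hypothetical helpers, and the boxes are assumed to be `[x0, y0, x1, y1]` in 0-1000 normalized coordinates.

```python
from typing import Dict, List

Box = List[float]  # assumed layout: [x0, y0, x1, y1], normalized to 0-1000


def center(box: Box) -> tuple:
    """Return the (x, y) centre of a box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def pair_geometry(q_box: Box, a_box: Box, scale: float = 1000.0) -> List[float]:
    """Relative-position features for one (question, answer) pair.

    Returns [dx, dy, distance], each scaled to roughly [-1, 1]. These could be
    concatenated with the text features of the two entities (an assumption,
    not how the repo's RE head is documented to work).
    """
    qx, qy = center(q_box)
    ax, ay = center(a_box)
    dx = (ax - qx) / scale
    dy = (ay - qy) / scale
    return [dx, dy, (dx * dx + dy * dy) ** 0.5]


def nearest_answer(q_box: Box, answers: Dict[str, Box]) -> str:
    """Position-only baseline: pick the answer whose centre is closest."""
    return min(answers, key=lambda name: pair_geometry(q_box, answers[name])[2])


# Toy layout: two rows, each with a question on the left and its answer on the right.
questions = {"Q1": [50, 100, 150, 120], "Q2": [50, 200, 150, 220]}
answers = {"A1": [160, 100, 300, 120], "A2": [160, 200, 300, 220]}
links = {q: nearest_answer(box, answers) for q, box in questions.items()}
```

On this toy layout the position-only baseline links each question to the answer on its own row, which is the behaviour I would like the RE model to pick up.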

logan-markewich commented 1 year ago

Have you looked at the FUNSD dataset? The original version of the dataset contains relation labels, relating questions (i.e. field names) to answers (i.e. field values). This seems to be similar to your idea here.
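
For reference, a rough sketch of how those relation labels can be read from a FUNSD-style annotation. The `sample` dict below is made up for illustration and only mimics the layout of the original FUNSD JSON (a `form` list whose entities carry `id`, `text`, `label`, `box` and `linking`); `linked_pairs` is a hypothetical helper, not part of any library.

```python
from typing import List, Tuple

# Made-up annotation that mimics the FUNSD JSON layout (not real dataset content).
sample = {
    "form": [
        {"id": 0, "text": "Name:", "label": "question",
         "box": [50, 100, 150, 120], "linking": [[0, 1]]},
        {"id": 1, "text": "Jane Doe", "label": "answer",
         "box": [160, 100, 300, 120], "linking": [[0, 1]]},
        {"id": 2, "text": "Date:", "label": "question",
         "box": [50, 200, 150, 220], "linking": [[2, 3]]},
        {"id": 3, "text": "2020-01-01", "label": "answer",
         "box": [160, 200, 300, 220], "linking": [[2, 3]]},
    ]
}


def linked_pairs(annotation: dict) -> List[Tuple[str, str]]:
    """Collect (question text, answer text) pairs from the `linking` fields."""
    entities = {entity["id"]: entity for entity in annotation["form"]}
    pairs = set()  # each link is listed on both entities, so de-duplicate
    for entity in annotation["form"]:
        for src, dst in entity.get("linking", []):
            s, d = entities.get(src), entities.get(dst)
            if s is None or d is None:
                continue  # skip links that point at a missing entity
            if s["label"] == "question" and d["label"] == "answer":
                pairs.add((src, dst))
    return [(entities[s]["text"], entities[d]["text"]) for s, d in sorted(pairs)]


pairs = linked_pairs(sample)
```

Each pair comes with the boxes of both entities, so the same annotation provides both the relation label and the positions.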

tuongtranegs commented 1 year ago

@logan-markewich, no. I want the model to learn from the box positions of the entities, whereas the model here learns the links from language only.

logan-markewich commented 1 year ago

LiLT already uses boxes to learn (and the FUNSD dataset has boxes as well), so I'm not sure what you mean 🤔

tuongtranegs commented 1 year ago

@logan-markewich, for the XFUND dataset there are two models to train: SER (needs position) and RE (only relations, no position).

sudheer997 commented 1 year ago

Hi there @tuongtranegs @logan-markewich @nielsrogge @NielsRogge,

I'm wondering if Hugging Face's Transformers library includes support for relation extraction using LiLT. I'm interested in fine-tuning a pre-trained model for relation extraction, but I'm not sure if the library provides this functionality.

Could someone please let me know if relation extraction is supported in Hugging Face's Transformers library, and if so, which pre-trained models are recommended for this task?

Thanks!

logan-markewich commented 1 year ago

@sudheer997 relation extraction isn't really supported by huggingface. If you want to support it, I suggest using the LiLT model and adding a relation extraction head to it.
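
A minimal sketch of what such a head could look like, assuming you already have per-token hidden states from the encoder (for example the `last_hidden_state` of `LiltModel` in Transformers). `PairRelationHead` is a hypothetical name, and this is not the RE head used in this repo, just a plain pairwise classifier in PyTorch.

```python
import torch
from torch import nn


class PairRelationHead(nn.Module):
    """Hypothetical head: classifies (question, answer) pairs as linked or not."""

    def __init__(self, hidden_size: int, proj_size: int = 128):
        super().__init__()
        # Separate projections so the two roles are not treated symmetrically.
        self.q_proj = nn.Linear(hidden_size, proj_size)
        self.a_proj = nn.Linear(hidden_size, proj_size)
        self.classifier = nn.Sequential(
            nn.Linear(2 * proj_size, proj_size),
            nn.ReLU(),
            nn.Linear(proj_size, 2),  # logits for [not linked, linked]
        )

    def forward(self, hidden_states, q_index, a_index):
        # hidden_states: (seq_len, hidden_size) for a single document
        # q_index / a_index: (num_pairs,) index of the first token of each entity
        q = self.q_proj(hidden_states[q_index])
        a = self.a_proj(hidden_states[a_index])
        return self.classifier(torch.cat([q, a], dim=-1))


# Usage with random features standing in for encoder output.
torch.manual_seed(0)
head = PairRelationHead(hidden_size=768)
hidden_states = torch.randn(16, 768)
logits = head(hidden_states, torch.tensor([0, 3]), torch.tensor([5, 9]))
```

You would train this with a cross-entropy loss over candidate pairs built from the SER output, with the encoder either frozen or fine-tuned jointly.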

lalitr994 commented 1 year ago

@tuongtranegs can you please share the inference code? I am getting an error while initializing the tokenizer.