microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

LayoutLMv2 is added to HuggingFace Transformers #417

Open NielsRogge opened 3 years ago

NielsRogge commented 3 years ago

Hi,

I've added LayoutLMv2 and LayoutXLM to HuggingFace Transformers. I've also created several notebooks to fine-tune the model on custom data, as well as to use it for inference. Demo notebooks can be found here. I've split them up according to the different datasets: FUNSD, CORD, DocVQA and RVL-CDIP.

For now, you've got to install Transformers from master to use it: pip install git+https://github.com/huggingface/transformers.git

The big difference with LayoutLM (v1) is that I've now also created a processor called LayoutLMv2Processor. It takes care of all the preprocessing required for the model (i.e. you just give it an image and it returns input_ids, attention_mask, token_type_ids, bbox and image). It uses Tesseract under the hood for OCR. You can also optionally provide your own words and boxes, if you prefer to use your own OCR. All documentation can be found here: https://huggingface.co/transformers/master/model_doc/layoutlmv2.html
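A minimal sketch of both use cases (the image path and the words/boxes below are just dummy values; the first use case requires Tesseract and pytesseract to be installed):

```python
from PIL import Image
from transformers import (LayoutLMv2FeatureExtractor, LayoutLMv2Processor,
                          LayoutLMv2Tokenizer)

image = Image.open("document.png").convert("RGB")  # any document image

# Use case 1: let the processor run Tesseract OCR under the hood
processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased")
encoding = processor(image, return_tensors="pt")
print(encoding.keys())  # input_ids, attention_mask, token_type_ids, bbox, image

# Use case 2: bring your own OCR results (words + boxes normalized to 0-1000)
feature_extractor = LayoutLMv2FeatureExtractor(apply_ocr=False)
tokenizer = LayoutLMv2Tokenizer.from_pretrained("microsoft/layoutlmv2-base-uncased")
processor = LayoutLMv2Processor(feature_extractor, tokenizer)

words = ["hello", "world"]                         # dummy OCR output
boxes = [[48, 84, 120, 111], [130, 84, 200, 111]]  # one [x0, y0, x1, y1] box per word
encoding = processor(image, words, boxes=boxes, return_tensors="pt")
```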

Perhaps relevant to the following issues: #333, #335, #351, #329, #356

wolfshow commented 3 years ago

Fantastic work!

NielsRogge commented 3 years ago

Thank you :) I've also made a web demo (Gradio) which you can try here: https://huggingface.co/spaces/nielsr/LayoutLMv2-FUNSD

RishabhMaheshwary commented 3 years ago

Hi, how can I fine-tune LayoutLMv2 on FUNSD for relation extraction?

ConorNugent commented 3 years ago

Fantastic! I have been using the Hugging Face LayoutXLM model for a while, but the new API looks super neat. Looking forward to cleaning up my code. Thanks so much, Niels!

thinklikecomputerscientist commented 3 years ago

Thanks <3

fernandorovai commented 3 years ago

Amazing work!

sz-lcw commented 3 years ago

Hi @NielsRogge, thanks for your work! After installing the latest version of Transformers (v4.10.0), I couldn't import LayoutLMv2Processor. The error is shown below:

ImportError: /lib64/libm.so.6: version `GLIBC_2.29' not found (required by xxxxx/python3.7/site-packages/tokenizers/tokenizers.cpython-37m-x86_64-linux-gnu.so)

How can I fix this problem? Thank you.

lalitr994 commented 3 years ago

Any way to get a confidence score from it?

NielsRogge commented 3 years ago

Any way to get a confidence score from it?

Hi, neural networks (like LayoutLMv2) typically return logits, which are the raw (unnormalized) scores for the classes. For example, if you have a neural network for sequence classification, it will return logits of shape (batch_size, num_classes). To turn these into confidence scores, you can apply a softmax function on them to turn them into probabilities (also called confidence scores).
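For example, a minimal sketch with toy logits (not taken from a real model):

```python
import torch

logits = torch.tensor([[2.3, -1.1, 0.4]])        # shape (batch_size, num_classes), toy values
probs = torch.softmax(logits, dim=-1)            # probabilities summing to 1 per example
confidence, predicted_class = probs.max(dim=-1)  # top class and its confidence score
print(predicted_class.item(), round(confidence.item(), 3))
```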

lalitr994 commented 3 years ago

Thanks, Solved.

ManuelFay commented 3 years ago

Hello Niels, amazing work! Out of curiosity, will you be adding LayoutReader to the HF ecosystem as well? If not, I'll try to do it eventually, but I can't guarantee I'll have the time anytime soon.

7fantasysz commented 2 years ago

It would be much easier to make it available from HF, as the input data structure of the current LayoutReader implementation is not clear.

Anas-Alshaghouri commented 2 years ago

Hello @lalitr994, in what part of the code did you manage to get the confidence scores? Your help is appreciated.