microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

Demo notebook for LayoutLMForSequenceClassification #287

Open NielsRogge opened 3 years ago

NielsRogge commented 3 years ago

Hey there,

I've recently improved LayoutLM in the HuggingFace Transformers library by adding:

- more documentation + code examples,
- a demo notebook that illustrates how to fine-tune LayoutLMForTokenClassification on the FUNSD dataset,
- integration tests that verify whether the implementation in HuggingFace Transformers produces the same output tensors as the original implementation on the same input data,
- and finally, LayoutLMForSequenceClassification.

My PR was merged yesterday :)

Now I'm also preparing a notebook that illustrates how to fine-tune LayoutLMForSequenceClassification on (a small subset of) the RVL-CDIP dataset. However, the model doesn't seem to be able to overfit this tiny subset (16 images per class for each of the 16 labels, so 256 training examples). You can run it here: https://colab.research.google.com/drive/1DUpTi2aL64AuIJ_9g6dGgKfltEEFqQbt?usp=sharing

Any feedback is greatly appreciated!

NielsRogge commented 3 years ago

Btw, the demo notebook for fine-tuning LayoutLMForTokenClassification on the FUNSD dataset can be found here.

aritzLizoain commented 3 years ago

Hi @NielsRogge, thanks for providing the notebooks!

I am working with your demo notebook for fine-tuning LayoutLMForTokenClassification. How can we save the fine-tuned model in order to use it for inference in the future? I don't see any output file after fine-tuning.

Thank you in advance!

NielsRogge commented 3 years ago

Hi! In HuggingFace, a model can be saved using model.save_pretrained("name-of-your-directory"). This will save both the weights (pytorch_model.bin) and the configuration (config.json) to that directory.
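For example, a minimal save/reload round trip (using a tiny random config here just to keep the example fast; a real fine-tuned model is saved and reloaded the same way):

```python
from transformers import LayoutLMConfig, LayoutLMForTokenClassification

# Tiny, randomly initialised model purely for illustration; in the notebook
# this would be the model you just fine-tuned.
config = LayoutLMConfig(
    hidden_size=32, num_hidden_layers=1, num_attention_heads=2, intermediate_size=64
)
model = LayoutLMForTokenClassification(config)

# Writes the weights and config.json to the directory (the weight file name
# depends on your Transformers version: pytorch_model.bin or model.safetensors).
model.save_pretrained("my-finetuned-layoutlm")

# Later, reload for inference from the same directory.
reloaded = LayoutLMForTokenClassification.from_pretrained("my-finetuned-layoutlm")
```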

aritzLizoain commented 3 years ago

Thank you for your prompt reply!

monuminu commented 3 years ago

@NielsRogge Can't thank you enough, it really helped. I took your code and implemented it without installing unilm, using pure transformers. I would love to add my notebook to your repo.

VishnuGopireddy commented 3 years ago

@NielsRogge I am getting `PicklingError: Can't pickle <class 'layoutlm.data.funsd.InputFeatures'>: import of module 'layoutlm.data.funsd' failed` while preparing a dataloader for the FUNSD dataset. Can you please help?
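(For context: this kind of PicklingError often comes from the DataLoader spawning worker processes that can't re-import the dataset's feature class. A minimal sketch of the workaround, with a stand-in dataset rather than the actual FUNSD features:)

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the FUNSD features: eight dummy one-element examples,
# just to show the dataloader setup itself.
train_dataset = TensorDataset(torch.arange(8).unsqueeze(1))

# num_workers=0 keeps data loading in the main process, so no classes
# have to be pickled and re-imported by worker subprocesses.
train_dataloader = DataLoader(train_dataset, batch_size=2, num_workers=0)
first_batch = next(iter(train_dataloader))  # a list holding one (2, 1) tensor
```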

NielsRogge commented 3 years ago

Hi @monuminu @VishnuGopireddy, I have a new notebook that adds visual features from a ResNet-101 backbone in addition to the text + layout features. You can find it here: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/LayoutLM/Add_image_embeddings_to_LayoutLM.ipynb

It relies entirely on HuggingFace Transformers, no need for the unilm repo anymore :)

VishnuGopireddy commented 3 years ago

@NielsRogge Awesome!!! Working fine. Big thanks.

monuminu commented 3 years ago

@NielsRogge amazing work !

VishnuGopireddy commented 3 years ago

Hi @NielsRogge Nice work, I am able to get all the tags for each word. Is there any way to get the correspondence between tags, i.e. mapping a question to its answer? Thanks...

nkrot commented 3 years ago

Hi @NielsRogge, thanks for the notebook, it is very instructive! It would be even more useful if it contained an example showing how to use the fine-tuned model for inference.

vinayakk1094 commented 3 years ago

The link seems to be broken - 'Sorry, the file you have requested does not exist.'

NielsRogge commented 3 years ago

@vinayakk1094 hi, all tutorials can be found here (both for LayoutLM and LayoutLMv2): https://github.com/NielsRogge/Transformers-Tutorials