microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

Question. LayoutLMv3. How to pretrain from scratch? #1042

Open darkoob12 opened 1 year ago

darkoob12 commented 1 year ago

Hello, and thank you for sharing your code. I am working on document understanding and wanted to test LayoutLMv3, but I need to retrain it on my own data. I could not find any guide for pretraining from scratch, nor any code that implements the three self-supervised tasks described in the paper. So my question is: has the pretraining source code been published? If so, where can I find it?

Thanks.

chriscpy commented 1 year ago

It seems that the pretraining code has not been released.

nobody4t commented 1 year ago

Have you ever figured it out?

darkoob12 commented 1 year ago

@dongwangdw
I think there is no training code. You have to write the self-supervision heads yourself. I have not started yet. I am collecting English data to try fine-tuning. If it works, I will go for pretraining from scratch using multilingual data.
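
Since no official pretraining code appears to be available, here is a minimal sketch of how the training targets for two of the three objectives might be prepared: span masking for masked language modeling (MLM) and labels for word-patch alignment (WPA). It is not the authors' implementation. The function names (`sample_poisson`, `span_mask`, `wpa_labels`) are hypothetical, and the defaults (30% of text tokens masked, span lengths drawn from a Poisson distribution with λ = 3) are assumptions based on a reading of the LayoutLMv3 paper that should be checked against it. The masked image modeling (MIM) objective also needs an image tokenizer to produce its targets and is not covered here.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def span_mask(num_tokens, mask_ratio=0.3, lam=3.0, seed=0):
    """Return booleans marking which text tokens to mask for MLM.

    Spans with Poisson-distributed lengths are dropped at random positions
    until round(num_tokens * mask_ratio) tokens are covered.
    The default ratio and lambda are assumptions, not verified settings.
    """
    rng = random.Random(seed)
    budget = int(round(num_tokens * mask_ratio))
    masked = [False] * num_tokens
    covered = 0
    while covered < budget:
        # A span is at least one token and never exceeds the remaining budget.
        length = min(max(1, sample_poisson(lam, rng)), budget - covered)
        start = rng.randrange(0, num_tokens - length + 1)
        for i in range(start, start + length):
            if not masked[i]:
                masked[i] = True
                covered += 1
    return masked


def wpa_labels(text_masked, token_to_patch, patch_masked):
    """Build word-patch alignment targets for each text token.

    1    -> the token's image patch is visible ("aligned")
    0    -> the token's image patch is masked ("unaligned")
    None -> the text token itself is masked, so it is excluded from the loss
    """
    labels = []
    for is_masked, patch in zip(text_masked, token_to_patch):
        if is_masked:
            labels.append(None)
        else:
            labels.append(0 if patch_masked[patch] else 1)
    return labels
```

In a real pipeline, the mask from `span_mask` would decide which input token IDs are replaced by the mask token, and `token_to_patch` would be derived from each word's bounding box and the image patch grid. Both of those steps depend on your tokenizer and preprocessing, so they are left out of this sketch.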

arundprabhu commented 8 months ago

Hi @darkoob12, were you able to reproduce the MIM objective?