microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

pretrain LayoutLM v2 on new dataset #603

Open qifeishen opened 2 years ago

qifeishen commented 2 years ago

Dear Authors, thanks for the great work. Do you plan to release the pretraining source code for LayoutLM v2?

bonejay commented 2 years ago

Yeah, it would be a great help. It's kind of tiresome that none of the document transformers (TILT, StructuralLM, LayoutLM, StrucTexT) have released their pretraining code, although it is the most important factor in their performance.

zzcgithub commented 2 years ago

I hope so

CheungZeeCn commented 2 years ago

+1

zzcgithub commented 2 years ago

+2

sudhirpol522 commented 1 year ago

+3

rr191211 commented 1 year ago

I wonder when the source code will be released. Thanks

ivsanro1 commented 1 year ago

> Yeah, it would be a great help. It's kind of tiresome that none of the document transformers (TILT, StructuralLM, LayoutLM, StrucTexT) have released their pretraining code, although it is the most important factor in their performance.

Just a guess, but they do this on purpose to force you to use their pre-trained models, and also to keep control over their models and implementation. They never release their pretraining code (the same goes for MarkupLM).
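
In the meantime, anyone who just wants to continue pretraining on a new document dataset can piece together a rough loop on top of the public Hugging Face LayoutLMv2 implementation. Below is a minimal sketch of the masked visual-language modeling (MVLM) objective only; it is not the authors' pretraining code, it omits the text-image alignment and text-image matching objectives from the paper, and the LM head, masking rate, and file names are my own assumptions.

```python
# Rough MVLM sketch on top of the public LayoutLMv2 checkpoint (NOT the official
# pretraining code). Requires: torch, transformers, detectron2 (visual backbone),
# and pytesseract for the processor's built-in OCR.
import torch
from PIL import Image
from transformers import LayoutLMv2Processor, LayoutLMv2Model

processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased")
model = LayoutLMv2Model.from_pretrained("microsoft/layoutlmv2-base-uncased")

# Assumption: the released checkpoint ships no masked-LM head, so we add a simple
# linear head over the text positions and train it from scratch.
lm_head = torch.nn.Linear(model.config.hidden_size, model.config.vocab_size)
optimizer = torch.optim.AdamW(
    list(model.parameters()) + list(lm_head.parameters()), lr=5e-5
)

image = Image.open("page.png").convert("RGB")  # hypothetical document image
encoding = processor(image, return_tensors="pt")  # runs OCR by default

input_ids = encoding["input_ids"]
labels = input_ids.clone()

# Mask 15% of the text tokens, BERT-style (special tokens are not excluded here,
# for brevity); loss is computed only on the masked positions.
mask = torch.rand(input_ids.shape) < 0.15
input_ids[mask] = processor.tokenizer.mask_token_id
labels[~mask] = -100

outputs = model(
    input_ids=input_ids,
    bbox=encoding["bbox"],
    image=encoding["image"],
    attention_mask=encoding["attention_mask"],
)
# The encoder output is [text tokens; visual tokens]; keep only the text part.
text_hidden = outputs.last_hidden_state[:, : input_ids.shape[1], :]
logits = lm_head(text_hidden)

loss = torch.nn.functional.cross_entropy(
    logits.view(-1, logits.size(-1)), labels.view(-1), ignore_index=-100
)
loss.backward()
optimizer.step()
```

This only approximates continued pretraining on domain documents; matching the paper's results would still require the full multi-task setup and data scale that the authors have not released.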