microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

pretrain LayoutLM v2 on new dataset #603

Open qifeishen opened 2 years ago

qifeishen commented 2 years ago

Dear Authors, thanks for the great work. Do you plan to release the pretraining source code for LayoutLM v2?

bonejay commented 2 years ago

Yeah, it would be a great help. It's kind of tiresome that none of the document transformers (TILT, StructuralLM, LayoutLM, StrucTexT) have released their pretraining code, although it is the most important factor in their performance.

zzcgithub commented 2 years ago

I hope so

CheungZeeCn commented 2 years ago

+1

zzcgithub commented 2 years ago

+2

sudhirpol522 commented 1 year ago

+3

rr191211 commented 1 year ago

I wonder when the source code will be released. Thanks

ivsanro1 commented 1 year ago

> Yeah, it would be a great help. It's kind of tiresome that none of the document transformers (TILT, StructuralLM, LayoutLM, StrucTexT) have released their pretraining code, although it is the most important factor in their performance.

Just a guess, but they do this on purpose to force you to use their pre-trained models, and also to keep control over their models and implementation. They never release their pretraining code (the same goes for MarkupLM).
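
In the meantime, anyone who just wants to continue pretraining on a new document dataset can piece together a rough loop on top of the public Hugging Face LayoutLMv2 implementation. Below is a minimal sketch of the masked visual-language modeling (MVLM) objective only; it is not the authors' pretraining code, it omits the text-image alignment and text-image matching objectives from the paper, and the LM head, masking rate, and file names are my own assumptions.

```python
# Rough MVLM sketch on top of the public LayoutLMv2 checkpoint (NOT the official
# pretraining code). Requires: torch, transformers, detectron2 (visual backbone),
# and pytesseract for the processor's built-in OCR.
import torch
from PIL import Image
from transformers import LayoutLMv2Processor, LayoutLMv2Model

processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased")
model = LayoutLMv2Model.from_pretrained("microsoft/layoutlmv2-base-uncased")

# Assumption: the released checkpoint ships no masked-LM head, so we add a simple
# linear head over the text positions and train it from scratch.
lm_head = torch.nn.Linear(model.config.hidden_size, model.config.vocab_size)
optimizer = torch.optim.AdamW(
    list(model.parameters()) + list(lm_head.parameters()), lr=5e-5
)

image = Image.open("page.png").convert("RGB")  # hypothetical document image
encoding = processor(image, return_tensors="pt")  # runs OCR by default

input_ids = encoding["input_ids"]
labels = input_ids.clone()

# Mask 15% of the text tokens, BERT-style (special tokens are not excluded here,
# for brevity); loss is computed only on the masked positions.
mask = torch.rand(input_ids.shape) < 0.15
input_ids[mask] = processor.tokenizer.mask_token_id
labels[~mask] = -100

outputs = model(
    input_ids=input_ids,
    bbox=encoding["bbox"],
    image=encoding["image"],
    attention_mask=encoding["attention_mask"],
)
# The encoder output is [text tokens; visual tokens]; keep only the text part.
text_hidden = outputs.last_hidden_state[:, : input_ids.shape[1], :]
logits = lm_head(text_hidden)

loss = torch.nn.functional.cross_entropy(
    logits.view(-1, logits.size(-1)), labels.view(-1), ignore_index=-100
)
loss.backward()
optimizer.step()
```

This only approximates continued pretraining on domain documents; matching the paper's results would still require the full multi-task setup and data scale that the authors have not released.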