Fine-tuning ViLT for the MLM task on a new dataset

Hi. Thanks for providing the code for such great work. I am new to language models, so I apologize if these are trivial questions.

I am wondering whether it is possible to fine-tune the model for masked language modeling (MLM) on a new dataset. Basically, I want a model that can predict the [MASK] tokens for my own dataset of custom text and images. Could you please share how to do this?
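To make the goal concrete, here is a rough sketch of the kind of training pair I have in mind. It is not ViLT's actual masking code: the whitespace split stands in for the real tokenizer, and `mask_tokens`, the 15% masking rate, and the example image path are my own assumptions for illustration.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) for one caption.

    labels[i] holds the original token where position i was masked,
    and None everywhere else (positions the loss would ignore).
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)

    # Guarantee at least one prediction target per non-empty caption.
    if tokens and MASK_TOKEN not in masked:
        i = rng.randrange(len(tokens))
        labels[i] = tokens[i]
        masked[i] = MASK_TOKEN
    return masked, labels


if __name__ == "__main__":
    # Hypothetical record from a custom image-text dataset.
    record = {
        "image_path": "images/0001.jpg",
        "caption": "a technician inspects the circuit board",
    }
    tokens = record["caption"].split()  # stand-in for the real tokenizer
    masked, labels = mask_tokens(tokens, seed=0)
    print(record["image_path"], masked, labels)
```

The model would receive the image together with the masked caption and should recover the tokens stored in `labels`.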

Thanks in advance for your time and help. Best regards.

dandelin/ViLT, issue #79