`DataCollatorForPrivateCausalLanguageModeling` hardcoded to `mlm=False`

microsoft / dp-transformers

Differentially-private transformers using HuggingFace and Opacus

MIT License


It's safe to set it to True. In our examples we mostly worked with unidirectional language models, and we built this DataCollator for causal language modeling, which is why the `mlm` parameter is hardcoded to False. You can use `mlm=True` for masked language modeling. You may not even need this particular DataCollator if you don't run into a similar issue with `position_ids`. I guess we could have been more general by calling it `DataCollatorForPrivateLanguageModeling` and letting the user pass the `mlm` parameter :)
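To make the difference concrete, here is a rough pure-Python sketch of what the `mlm` flag changes in the labels a language-modeling collator produces. It is an illustration only, not the code in dp-transformers or HuggingFace: `make_labels`, the token ids, and the default `mask_token_id`/`pad_token_id` values are made up for the example, and the masking is simplified to always substitute the mask token.

```python
import random

IGNORE_INDEX = -100  # label value conventionally ignored by the loss


def make_labels(input_ids, mlm, mask_token_id=103, pad_token_id=0,
                mlm_probability=0.15, rng=None):
    """Return (inputs, labels) for one sequence.

    Hypothetical helper for illustration; ids and defaults are assumptions.
    """
    rng = rng or random.Random(0)

    if not mlm:
        # Causal LM: inputs are unchanged, labels copy the inputs,
        # and padding positions are excluded from the loss.
        labels = [IGNORE_INDEX if t == pad_token_id else t for t in input_ids]
        return list(input_ids), labels

    # Masked LM (simplified): each non-padding token is masked with
    # probability mlm_probability; only masked positions carry a label.
    inputs, labels = [], []
    for t in input_ids:
        if t != pad_token_id and rng.random() < mlm_probability:
            inputs.append(mask_token_id)
            labels.append(t)
        else:
            inputs.append(t)
            labels.append(IGNORE_INDEX)
    return inputs, labels


# Causal case is deterministic: labels mirror inputs except for padding.
causal_inputs, causal_labels = make_labels([5, 6, 7, 0], mlm=False)

# Masked case depends on the random generator, so no fixed output is shown.
masked_inputs, masked_labels = make_labels([5, 6, 7, 8, 9, 0], mlm=True,
                                           rng=random.Random(1))
```

In practice you would not write this by hand: the `DataCollatorForLanguageModeling` class in `transformers` already takes `mlm` and `mlm_probability` arguments, so a more general private collator could simply forward them.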


`DataCollatorForPrivateCausalLanguageModeling` hardcoded to `mlm=False` #42