inspect_lm_huggingface.py now has an option to repeat [MASK] tokens, but this doesn't work due to https://github.com/huggingface/transformers/issues/3609

We could implement our own solution using AutoModelWithLMHead, following the suggestions in my comment in the above transformers issue, or implement a solution inside the transformers library and make a PR (a sketch follows below).

Also look at FitBERT, SpanBERT and other tools that may already have implemented this.
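As a rough illustration of what an in-house solution could look like, here is a minimal sketch that fills several [MASK] positions with a masked LM. The model name (bert-base-uncased), the example sentence, and the greedy per-position decoding are assumptions for illustration, not the approach proposed in the issue comment.

```python
# Minimal sketch, assuming bert-base-uncased and greedy per-position decoding.
import torch
from transformers import AutoModelWithLMHead, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelWithLMHead.from_pretrained("bert-base-uncased")
model.eval()

text = "The capital of France is [MASK] [MASK]."  # answer spans two sub-tokens
inputs = tokenizer(text, return_tensors="pt")
mask_positions = (
    inputs["input_ids"][0] == tokenizer.mask_token_id
).nonzero(as_tuple=True)[0]

with torch.no_grad():
    outputs = model(**inputs)
logits = outputs[0]  # shape: (1, seq_len, vocab_size)

# Take the top token at each [MASK] position independently.
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))
```

Note that this fills each [MASK] without conditioning on the others, so the predicted sub-tokens need not form a coherent multi-token answer; handling that dependency (e.g. by iterative or joint decoding) is the part a proper solution or transformers PR would have to address.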
Meng et al. (2022), "Rewire-then-Probe: A Contrastive Recipe for Probing Biomedical Knowledge of Pre-trained Language Models", propose a workaround for obtaining multi-token answers from BERT.
Edit: