thunlp / OpenPrompt

An Open-Source Framework for Prompt-Learning.
https://thunlp.github.io/OpenPrompt/
Apache License 2.0

{"mask"} automatically assigns <extra_id_0>, which conflicts with the task of masked filling #234


ChristLBUPT commented 1 year ago

I want to do prompt tuning for a masked-fill-based T5 model, whose inputs look like this:

test_dataset = [
    InputExample(text_a="The quick <extra_id_0> fox <extra_id_1> over the lazy dog", tgt_text="<extra_id_0> brown <extra_id_1> jumps <extra_id_2>"),
    InputExample(text_a="The Capital city of China is <extra_id_0>, which has a <extra_id_1> of 20 million", tgt_text="<extra_id_0> Beijing <extra_id_1> population <extra_id_2>")
]
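For context, T5's masked-filling format pairs each `<extra_id_N>` sentinel in the source with a span in the target. A plain-Python sketch of that pairing (the `merge_spans` helper is hypothetical, not part of OpenPrompt) that splices the target spans back into the source:

```python
import re

def merge_spans(source: str, target: str) -> str:
    """Reconstruct the full sentence by splicing the target's spans
    back into the source's <extra_id_N> sentinel positions."""
    # Map sentinel index -> span text, e.g. {"0": "brown", "1": "jumps"}
    spans = dict(re.findall(r"<extra_id_(\d+)>\s*([^<]*?)\s*(?=<extra_id_|$)", target))
    # Replace each sentinel in the source with its matching span
    return re.sub(r"<extra_id_(\d+)>", lambda m: spans.get(m.group(1), "").strip(), source)

src = "The quick <extra_id_0> fox <extra_id_1> over the lazy dog"
tgt = "<extra_id_0> brown <extra_id_1> jumps <extra_id_2>"
print(merge_spans(src, tgt))  # The quick brown fox jumps over the lazy dog
```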

If I use a template similar to the one given in 2.1_conditional_generation.py, that is:

template = ManualTemplate(t5tokenizer, '{"placeholder": "text_a"} {"special": "<eos>"} {"mask"}')

it automatically assigns an <extra_id_0> at the position of {"mask"} and splits the source sentence from the target sentence with the special token </s>, which results in duplicate <extra_id_0> tokens in the input sentence, as follows:

The Capital city of China is <extra_id_0>, which has a <extra_id_1> of 20 million </s> <extra_id_0>
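A plain-Python sketch of the clash, with no OpenPrompt involved (assuming the template renders {"mask"} as T5's first sentinel and inserts the </s> separator, as described above):

```python
# text_a already uses <extra_id_0> for masked filling.
text_a = "The Capital city of China is <extra_id_0>, which has a <extra_id_1> of 20 million"

# What the template effectively builds: text_a, the </s> separator,
# then the {"mask"} slot, rendered as the first T5 sentinel.
templated = f"{text_a} </s> <extra_id_0>"

# The sentinel <extra_id_0> now appears twice in the model input.
assert templated.count("<extra_id_0>") == 2
```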

I know it is possible to manually increment every extra_id in my dataset by 1, but is it possible to use ONLY the source sentence as input and avoid the automatically added extra ids?
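The renumbering workaround mentioned above can be sketched with a small regex helper (hypothetical, stdlib only, not an OpenPrompt API):

```python
import re

def shift_extra_ids(text: str, offset: int = 1) -> str:
    """Add `offset` to the index of every <extra_id_N> sentinel in `text`,
    so the template's automatically added <extra_id_0> stays unique."""
    return re.sub(
        r"<extra_id_(\d+)>",
        lambda m: f"<extra_id_{int(m.group(1)) + offset}>",
        text,
    )

src = "The quick <extra_id_0> fox <extra_id_1> over the lazy dog"
print(shift_extra_ids(src))
# The quick <extra_id_1> fox <extra_id_2> over the lazy dog
```

Applied to both text_a and tgt_text before building the InputExample, this keeps <extra_id_0> free for the {"mask"} slot.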

yulinchen99 commented 1 year ago

I am not sure I understand your question correctly. It seems that you are not really using {"mask"}, and therefore not using a verbalizer at all. In that case you should probably just use the transformers library directly, without wrapping it in openprompt's PromptModel.