It appears that `config/pretrain-alldata-base.json` is not the pretraining configuration used in your paper. There is no `cls_concat` setting in this file, so it falls back to the default value, and as a result pretraining uses MLM rather than VMLM as described in the paper. Could you please provide a configuration that reproduces your results?
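For example, would adding something along these lines to `config/pretrain-alldata-base.json` be enough to switch pretraining to VMLM? (The key name is taken from the code; the value shown is only a guess, since I could not find the expected setting documented.)

```json
{
  "cls_concat": true
}
```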