Train a system that places a classification head on top of a PLM, possibly also drawing on its lower layers, without fine-tuning the PLM's parameters, i.e., with a frozen encoder. This would tell us whether fine-tuning the PLM is really necessary or whether continued pre-training on target-domain data is sufficient.
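A frozen-encoder setup of this kind could be sketched roughly as follows in PyTorch. This is a minimal illustration under stated assumptions, not the actual system: `ToyEncoder` is a hypothetical stand-in for a real PLM, and the learned softmax-weighted mix over layers and the mean pooling are only one possible way of drawing on lower layers. All names, sizes, and the random data are illustrative.

```python
import torch
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for a PLM: returns the hidden states of every layer."""

    def __init__(self, vocab_size=100, hidden=32, num_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.layers = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden, hidden), nn.Tanh()) for _ in range(num_layers)]
        )

    def forward(self, token_ids):
        h = self.embed(token_ids)
        hidden_states = [h]  # embedding output counts as the lowest "layer"
        for layer in self.layers:
            h = layer(h)
            hidden_states.append(h)
        return hidden_states  # list of (batch, seq, hidden) tensors


class FrozenEncoderClassifier(nn.Module):
    """Classification head over a learned mix of the frozen encoder's layers."""

    def __init__(self, encoder, hidden, num_states, num_classes):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # freeze: the encoder is never updated
        # Assumption: one scalar weight per layer, normalised with softmax.
        self.layer_weights = nn.Parameter(torch.zeros(num_states))
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, token_ids):
        self.encoder.eval()  # keep the frozen encoder in inference mode
        with torch.no_grad():  # no gradients flow into the encoder
            states = torch.stack(self.encoder(token_ids))  # (layers, batch, seq, hidden)
        weights = torch.softmax(self.layer_weights, dim=0)
        mixed = (weights[:, None, None, None] * states).sum(dim=0)  # (batch, seq, hidden)
        pooled = mixed.mean(dim=1)  # assumption: mean pooling over tokens
        return self.head(pooled)


torch.manual_seed(0)
encoder = ToyEncoder()
model = FrozenEncoderClassifier(encoder, hidden=32, num_states=5, num_classes=3)

# Only the layer-mixing weights and the head are handed to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-2)

# Random toy batch: 8 sequences of 10 token ids, 3 classes.
token_ids = torch.randint(0, 100, (8, 10))
labels = torch.randint(0, 3, (8,))

encoder_before = [p.clone() for p in encoder.parameters()]
for _ in range(20):
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(token_ids), labels)
    loss.backward()
    optimizer.step()

logits = model(token_ids)
encoder_unchanged = all(
    torch.equal(before, after)
    for before, after in zip(encoder_before, encoder.parameters())
)
```

With a real PLM loaded through Hugging Face Transformers, the per-layer outputs would typically be obtained by passing `output_hidden_states=True` to the model call. Comparing this frozen variant against a fully fine-tuned one, starting from the same domain-adapted checkpoint, is what would isolate the contribution of fine-tuning.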