luyug / Condenser

EMNLP 2021 - Pre-training architectures for dense retrieval
Apache License 2.0

CLS of which layers to use in Condenser? last layer CLS? sum of last four layers CLS? #13

Closed mahdiabdollahpour closed 2 years ago

mahdiabdollahpour commented 2 years ago

Hi, thanks for the nice repo. After pre-training, Condenser has the same architecture as BERT (the Condenser heads are removed). Which CLS layer worked best for neural IR? The last layer's CLS? The sum of the last four layers' CLS? ...

luyug commented 2 years ago

We fine-tune the last backbone layer's CLS, which is the one passed to the head during pre-training.
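
For concreteness, here is a minimal sketch (not code from this repo) of pulling that last-layer CLS vector with HuggingFace `transformers` once the pre-trained Condenser weights are loaded as a plain BERT encoder; the checkpoint name is illustrative:

```python
# Minimal sketch, assuming the pre-trained Condenser checkpoint loads as a
# standard BERT-style encoder (heads removed). Checkpoint name is illustrative.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "Luyu/condenser"  # illustrative checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

batch = tokenizer(
    ["what is dense retrieval?"],
    padding=True,
    truncation=True,
    return_tensors="pt",
)

with torch.no_grad():
    outputs = model(**batch)

# CLS of the last backbone layer: token position 0 of last_hidden_state.
cls_embedding = outputs.last_hidden_state[:, 0]  # shape: (batch, hidden_size)
```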

luyug commented 2 years ago

Closing for now. Feel free to re-open if you have new questions.