I understand that the model returns a set of layers, but other models return different numbers of layers. How did the 8th layer get picked as a default?
The default model is set to "bert-base-multilingual-cased" and the best layer for this model is layer=8. Check out Figure 4 in the paper for more details.
Why is the default layer
8
?I understand that the model returns a set of layers, but other models return different numbers of layers. How did the 8th layer get picked as a default?
https://github.com/cisnlp/simalign/blob/249a7f331814d18bef7f8ff69f8474a91568c2c2/simalign/simalign.py#L26