Closed by niless 4 years ago
Currently, the scalar mix is computed outside the BERT embedder, with one mix for each task. That is why the embedder initially returns all layers. If you want a single global mix instead of task-specific mixes, you'll have to modify the code in the base model definition.
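To illustrate what "one mix per task" means, here is a rough sketch in plain Python. It is not the repository's code: `scalar_mix`, the task names, and the toy layer values are all made up for illustration, and it assumes the usual scalar-mix formulation (softmax-normalized weights over the layers, times a scale `gamma`).

```python
import math

def scalar_mix(layers, weights, gamma=1.0):
    """Softmax-weighted sum of per-layer vectors, scaled by gamma (hypothetical helper)."""
    # Numerically stable softmax over the per-layer scalar weights.
    m = max(weights)
    exps = [math.exp(w - m) for w in weights]
    total = sum(exps)
    probs = [e / total for e in exps]
    dim = len(layers[0])
    # Weighted sum across layers, one output value per embedding dimension.
    return [gamma * sum(p * layer[i] for p, layer in zip(probs, layers))
            for i in range(dim)]

# Toy stand-in for "all layers" of one token: 3 layers, 2 dimensions each.
layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Each task keeps its own mixing weights (task names are assumptions).
task_weights = {
    "tagger": [0.0, 0.0, 0.0],   # equal weights: plain average of the layers
    "parser": [2.0, 0.0, -2.0],  # leans toward the first layer
}
mixed = {task: scalar_mix(layers, w) for task, w in task_weights.items()}
```

A single global mix would amount to calling `scalar_mix` once with one shared weight list and passing the same result to every task.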
I was not able to use the scalar mix option by changing `combine_layers` to `mix` from `all`. `mix_embedding` is set to 12. Is there anything else that needs to change in the config file?