gesistsa / grafzahl

🧛 fine-tuning Transformers for text data from within R
https://gesistsa.github.io/grafzahl/
GNU General Public License v3.0

Layer extracted by grafzahl #27

Open LuigiC72 opened 6 months ago

LuigiC72 commented 6 months ago

Given the discussion about which layer to keep as a token's representation in downstream analysis (Jawahar et al., 2019; Ethayarajh, 2019) when using, for example, a pre-trained BERT model, I was wondering whether you are considering allowing the user to select a specific layer when fine-tuning a Transformer via grafzahl. At the moment, which layer is used when, for example, I specify `model_name = "bert-base-uncased"`? Thanks for your great package!

chainsawriot commented 6 months ago

By default, grafzahl uses essentially the same defaults as the underlying simpletransformers: no layers are frozen, so all layers may be updated during fine-tuning.

If you really want to freeze some layers, it is possible to do so with simpletransformers, but unfortunately not with grafzahl. If you want to customize fine-tuning at the layer level, I think you would be better off using simpletransformers or even transformers directly.
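For illustration, here is a minimal sketch (not part of grafzahl's API) of how layer freezing can be done with simpletransformers by setting `requires_grad = False` on the underlying Hugging Face model's parameters. The attribute names (`model.model.bert.embeddings`, `model.model.bert.encoder.layer`) follow the Hugging Face BERT layout and will differ for other architectures; the choice of freezing the first 8 of 12 encoder layers is arbitrary and only for demonstration.

```python
# Sketch: freeze the embeddings and lower encoder layers of a BERT model
# before fine-tuning with simpletransformers (assumes a BERT-style model;
# attribute paths differ for other architectures).
from simpletransformers.classification import ClassificationModel

model = ClassificationModel("bert", "bert-base-uncased", num_labels=2, use_cuda=False)

# model.model is the underlying Hugging Face BertForSequenceClassification.
# Freeze the embedding layer.
for param in model.model.bert.embeddings.parameters():
    param.requires_grad = False

# Freeze the first 8 of the 12 encoder layers (illustrative choice).
for layer in model.model.bert.encoder.layer[:8]:
    for param in layer.parameters():
        param.requires_grad = False

# Subsequent training then only updates the remaining (unfrozen) parameters:
# model.train_model(train_df)
```

grafzahl does not expose the model object at this level, which is why this kind of customization currently has to be done in simpletransformers or transformers itself.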