chengchingwen / Transformers.jl

Julia Implementation of Transformer models
MIT License

Create HuggingFace transformer model #137

Closed MNLubov closed 1 year ago

MNLubov commented 1 year ago

Hi, @chengchingwen

In a previous version of Transformers, it was possible to create a HuggingFace-style transformer with predefined parameters:

config = Transformers.HuggingFace.HGFBertConfig(;
  vocab_size=length(test_wordpiece.vocab),
  hidden_size=64,
  num_hidden_layers=1,
  num_attention_heads=1
)
model = Transformers.HuggingFace.HGFBertModel(config)

In the latest version of the Transformers package, this old way of creating a HuggingFace-style transformer no longer works. Is there a way to do something similar in the latest version? Thank you.

chengchingwen commented 1 year ago

For the model, the API has switched to `load_model` (even though no pretrained weights are being loaded). `load_model` takes a type and a config and calls the corresponding constructor, e.g. `model = load_model(HGFBertModel, config)`.

The configuration is a little tricky. I assumed people would modify an existing config object, because currently we can't train our own tokenizer. If you already have a config object, you can call `HGFConfig` on it to create a new object with overwritten values, as in: https://github.com/chengchingwen/Transformers.jl/blob/7cd3d96047097bf0e4ecf46f0e079309ed5856b0/example/BERT/mnli/train.jl#L25
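A minimal sketch of that pattern, assuming network access to fetch the `bert-base-uncased` config via the `hgf""` macro (the overwritten field values here are illustrative, not required):

```julia
using Transformers.HuggingFace

# Fetch an existing config, then derive a new one with some
# fields overwritten via the HGFConfig keyword constructor.
base = hgf"bert-base-uncased:config"
config = HGFConfig(base; num_hidden_layers = 1, num_attention_heads = 1)
```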

If you don't have a config object, another way is to create an empty config (`HuggingFace.HGFBertConfig((;))`) and overwrite it as in the example above (replacing `hgf"bert-base-uncased:config"` with the empty config).
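Putting the pieces together, a sketch of the fully offline path (the field values mirror the snippet in the question and are assumptions; unset fields fall back to the config type's defaults):

```julia
using Transformers.HuggingFace

# Start from an empty BERT config and overwrite the fields we care about;
# no download and no pretrained weights are involved.
empty_cfg = HuggingFace.HGFBertConfig((;))
config = HGFConfig(empty_cfg;
    vocab_size = 100,          # assumed; use length of your vocab
    hidden_size = 64,
    num_hidden_layers = 1,
    num_attention_heads = 1)

# Construct a randomly initialized model from the config.
model = load_model(HGFBertModel, config)
```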

MNLubov commented 1 year ago

@chengchingwen, thank you for your guidance on how to create transformer models in the new version of the package.