huggingface / optimum-graphcore

Blazing fast training of 🤗 Transformers on Graphcore IPUs

bert and roberta models #402

Closed. upunaprosk closed this issue 1 year ago.

upunaprosk commented 1 year ago

Hi,

BERT-base, RoBERTa-base, and most of the other models are no longer available. Is it planned to add them back soon? Is it possible to use older versions of them?

Cheers,

jimypbr commented 1 year ago

Hi!

What issue are you seeing? The models on the hub that end with -ipu, like bert-base-ipu and roberta-base-ipu, are intended only as IPUConfig files. The usage is:

from optimum.graphcore import IPUConfig

# The -ipu repos hold only IPU execution settings, not model weights
ipu_config = IPUConfig.from_pretrained("Graphcore/roberta-base-ipu")

For the actual model weights you can use the same checkpoints as you would on CPU or GPU, e.g. bert-base-uncased. Does this answer your question?
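For example, putting the two together with IPUTrainer (a minimal sketch; the classification head, output directory, and hyperparameters here are placeholders, not from this thread):

from transformers import AutoModelForSequenceClassification
from optimum.graphcore import IPUConfig, IPUTrainer, IPUTrainingArguments

# The regular hub checkpoint supplies the weights...
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
# ...while the -ipu repo supplies only the IPU execution settings.
ipu_config = IPUConfig.from_pretrained("Graphcore/bert-base-ipu")

args = IPUTrainingArguments(output_dir="./outputs")  # placeholder output dir
trainer = IPUTrainer(model=model, ipu_config=ipu_config, args=args)
# trainer.train() would then run fine-tuning on IPU once a dataset is attached.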

upunaprosk commented 1 year ago

Yes, thank you! I followed an example where the model weights are initialized from the Graphcore Hugging Face hub: model = transformers.BertForQuestionAnswering.from_pretrained("Graphcore/bert-large-uncased"), which it seems can be replaced by bert-large-uncased in that notebook.

jimypbr commented 1 year ago

Yeah, it can be initialized with those weights. Graphcore/bert-large-uncased is an equivalent version of bert-large-uncased that was pretrained on IPU, but either version will work for fine-tuning or inference on IPU. The "models" on the hub that end with -ipu are intended just as IPUConfigs; you can use those, though you can easily create your own if you want customization.
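If you do want a custom one, a rough sketch (the field names mirror those found in the Graphcore/*-ipu config files and may vary between optimum-graphcore versions; the values are illustrative placeholders):

from optimum.graphcore import IPUConfig

# Pipeline a 12-layer model over 4 IPUs: embeddings on IPU 0,
# then 4 transformer layers on each of IPUs 1-3.
ipu_config = IPUConfig(
    layers_per_ipu=[0, 4, 4, 4],
    ipus_per_replica=4,
    gradient_accumulation_steps=16,
    device_iterations=1,
)
ipu_config.save_pretrained("./my-ipu-config")  # reload later with from_pretrained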