huggingface / optimum-graphcore

Blazing fast training of 🤗 Transformers on Graphcore IPUs
Apache License 2.0

bert and roberta models #402

Closed (upunaprosk closed 1 year ago)

upunaprosk commented 1 year ago


BERT-base, RoBERTa-base, and most of the other models are no longer available. Are there plans to add them back soon? Is it possible to use older versions of them?


jimypbr commented 1 year ago


What issue are you seeing? The models on the hub that end with -ipu like bert-base-ipu and roberta-base-ipu are intended only as IPUConfig files. The usage is:

from optimum.graphcore import IPUConfig
ipu_config = IPUConfig.from_pretrained("Graphcore/roberta-base-ipu")

For the weights of the actual model you can use the same weights as you would on CPU or GPU - e.g. bert-base-uncased. Does this answer your question?
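To make the split between the two checkpoints concrete, here is a minimal sketch that pairs an IPUConfig from a -ipu repo with ordinary weights. It assumes optimum-graphcore and its IPU dependencies are installed; the helper name load_config_and_model is hypothetical, and the question-answering head is chosen only as an example task.

```python
# The "-ipu" repo only provides the IPUConfig; the weights come from the
# usual checkpoint you would use on CPU or GPU.
CONFIG_REPO = "Graphcore/bert-base-ipu"
WEIGHTS_CHECKPOINT = "bert-base-uncased"


def load_config_and_model(config_repo=CONFIG_REPO, weights=WEIGHTS_CHECKPOINT):
    """Hypothetical helper: return (ipu_config, model) for fine-tuning on IPU."""
    # Imported lazily because optimum.graphcore needs the IPU software stack.
    from optimum.graphcore import IPUConfig
    from transformers import BertForQuestionAnswering

    # IPU-specific execution settings, loaded from the "-ipu" repo.
    ipu_config = IPUConfig.from_pretrained(config_repo)
    # Standard model weights, loaded exactly as on CPU or GPU.
    model = BertForQuestionAnswering.from_pretrained(weights)
    return ipu_config, model
```

On a machine with the IPU stack available, calling load_config_and_model() should give you both objects ready to pass on to your training or inference setup.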

upunaprosk commented 1 year ago

Yes, thank you! I followed an example where the model weights are initialized from the Graphcore organization on the Hugging Face Hub: model = transformers.BertForQuestionAnswering.from_pretrained("Graphcore/bert-large-uncased"). It seems that checkpoint can be replaced by bert-large-uncased in that notebook.

jimypbr commented 1 year ago

Yeah, it can be initialized with those weights. Graphcore/bert-large-uncased is an equivalent version of bert-large-uncased that was pretrained on IPU, but either version will work for fine-tuning or inference on IPU. The "models" on the hub that end with -ipu are intended just as IPUConfigs. You can use those as they are, or easily create your own if you want customization.
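As a rough illustration of creating your own config, the sketch below builds an IPUConfig directly instead of loading one from the hub. The keyword names ipus_per_replica and layers_per_ipu are assumed to match the installed optimum-graphcore version, the values are illustrative rather than tuned, and the names EXAMPLE_SETTINGS and make_custom_ipu_config are hypothetical.

```python
# Illustrative values only: the 12 encoder layers of a BERT-base-sized model
# spread over 4 IPUs. Adjust both settings for your model and IPU system.
EXAMPLE_SETTINGS = {
    "ipus_per_replica": 4,
    "layers_per_ipu": [0, 4, 4, 4],
}


def make_custom_ipu_config(settings=None):
    """Hypothetical helper: build an IPUConfig from a dict of settings."""
    # Imported lazily because optimum.graphcore needs the IPU software stack.
    from optimum.graphcore import IPUConfig

    # Assumption: IPUConfig accepts these settings as keyword arguments.
    return IPUConfig(**(settings or EXAMPLE_SETTINGS))
```

If the keyword names differ in your version, the config file in one of the -ipu repos is probably the easiest reference for which fields are supported.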