helboukkouri / character-bert

Main repository for "CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters"
Apache License 2.0

Fine tuning with Trainer from huggingface #12

Closed by MatheusNtg 3 years ago

MatheusNtg commented 3 years ago

Is there any chance you could post an example of how to use your model for fine-tuning with the Trainer class from HuggingFace?

helboukkouri commented 3 years ago

Hi @MatheusNtg, I may add some examples in the future. In the meantime, if you feel comfortable using code that is still a work in progress, you could use this fork of the transformers library, which is meant to prepare a future merge of CharacterBERT into HuggingFace's library: https://github.com/helboukkouri/transformers/tree/add-character-bert

Basically just install the transformers library from the branch above and you should be able to do things like:

from transformers import CharacterBertModel

model = CharacterBertModel.from_pretrained('helboukkouri/character-bert')
# And use this as you would use BertModel
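To make "use this as you would use BertModel" concrete, here is a minimal sketch, assuming the fork is installed. CharacterBertTokenizer is the tokenizer class mentioned later in this thread; that it accepts return_tensors="pt" and that the model output exposes last_hidden_state are assumptions carried over from BertTokenizer and BertModel, not something confirmed here. The helper name embed_with_character_bert is hypothetical.

```python
def embed_with_character_bert(text, model_name="helboukkouri/character-bert"):
    """Return contextual embeddings for `text` (sketch, not verified against the fork)."""
    # Imports live inside the function so that defining it does not require
    # torch or the add-character-bert fork to be installed.
    import torch
    from transformers import CharacterBertModel, CharacterBertTokenizer

    tokenizer = CharacterBertTokenizer.from_pretrained(model_name)
    model = CharacterBertModel.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    # Assumption: same calling convention as BertTokenizer.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Assumption: same output field as BertModel.
    return outputs.last_hidden_state
```

Calling embed_with_character_bert("Hello world!") downloads the checkpoint on first use, so it needs network access as well as the fork.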

Hope this helps 😊

steveguang commented 3 years ago

Hi, @helboukkouri. I do not see CharacterBertModel in transformers. It says "ImportError: cannot import name 'CharacterBertModel' from 'transformers' (unknown location)". Did I do something wrong?

helboukkouri commented 3 years ago

Did you install the transformers library from the branch https://github.com/helboukkouri/transformers/tree/add-character-bert ? Basically, just clone the repo (making sure you have switched to the add-character-bert branch), then run pip install . from within the repo.
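The clone-then-install steps above can be sketched as a small script. This is only an illustration: the function names and the default clone directory are hypothetical, and it assumes git and pip are available on the machine.

```python
import subprocess
import sys

REPO_URL = "https://github.com/helboukkouri/transformers"
BRANCH = "add-character-bert"


def build_install_steps(clone_dir="transformers-character-bert"):
    """Return (command, working_directory) pairs; nothing is executed here."""
    return [
        # Clone the fork and check out the add-character-bert branch directly.
        (["git", "clone", "--branch", BRANCH, REPO_URL, clone_dir], None),
        # Run "pip install ." from within the cloned repo, using the current interpreter.
        ([sys.executable, "-m", "pip", "install", "."], clone_dir),
    ]


def install_character_bert_fork(clone_dir="transformers-character-bert"):
    """Run each step in order and stop at the first failure."""
    for command, cwd in build_install_steps(clone_dir):
        subprocess.run(command, cwd=cwd, check=True)
```

Calling install_character_bert_fork() performs the actual clone and install. Running pip through sys.executable keeps the install in the same environment as the Python interpreter that will later import transformers.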

If you do so, you should be able to run any of the following commands to import CharacterBertModel:

from transformers import CharacterBertModel
from transformers.models.character_bert import CharacterBertModel
from transformers.models.character_bert.modeling_character_bert import CharacterBertModel

MatheusNtg commented 3 years ago

Okay, and what about the CharacterIndexer class? How do we access this class through transformers? I think it would be nice to have the documentation updated to show how to use the model.

steveguang commented 3 years ago

@helboukkouri yeah, same problem. I cannot seem to run the examples: from modeling.character_bert import CharacterBertModel gives ModuleNotFoundError: No module named 'transformers.modeling_bert'. I am not sure how the two repos interact. It would be good to have an example.

helboukkouri commented 3 years ago

Hi @MatheusNtg, @steveguang, sorry for the delay.

Sorry for not being clear enough. The problems you are encountering arise because you are trying to use the HuggingFace PR code with the examples from https://github.com/helboukkouri/character-bert . These two code bases are not meant to be used together.

If you already have experience using the BERT classes from the transformers library, then forget about my code and just do the same thing with the CharacterBERT classes you get when installing the library from the PR code.

If you don't know how to use the transformers classes directly, then forget about the PR code and just use the original code from this repo. It may not have all the functionality that you need but it should be enough for doing basic fine-tuning.

======

@MatheusNtg, since you are using the PR code, there is no longer a CharacterIndexer class. Just use the CharacterBertTokenizer class as you would normally use BertTokenizer.

@steveguang, since you installed the transformers library from the PR, you have a recent version of the library in which from transformers.modeling_bert import ... no longer works, which explains your error. You can try changing the imports to from transformers.models.bert.modeling_bert import ... but I would generally advise against using the PR code and this repo's code together.
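For anyone who does change the imports, the sketch below shows one way to try the newer module path first and fall back to the older one. The helper name import_first_available is hypothetical; only the two module paths come from this thread, and the sketch is not a recommendation to mix the two code bases.

```python
import importlib


def import_first_available(module_names):
    """Import and return the first module in `module_names` that can be imported.

    Raises ImportError listing every candidate if none of them works.
    """
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError(
        "none of the candidates could be imported:\n" + "\n".join(errors)
    )


# Newer layout first, older flat layout second.
BERT_MODULE_CANDIDATES = [
    "transformers.models.bert.modeling_bert",
    "transformers.modeling_bert",
]

# Usage (requires some version of transformers to be installed):
# modeling_bert = import_first_available(BERT_MODULE_CANDIDATES)
# BertModel = modeling_bert.BertModel
```

When every candidate fails, the raised error lists each one together with its reason, which makes it easier to tell which library layout is actually installed.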

======

When the PR is merged, I will update this repo to use the transformers classes instead of my own custom ones.

Hope this helps ! 😊

MatheusNtg commented 3 years ago

@helboukkouri just for clarification, when you say "since you are using the PR code" you mean this code? https://github.com/helboukkouri/transformers/tree/add-character-bert

helboukkouri commented 3 years ago

Yep. It's the branch used for the ongoing transformers PR 😊