dbmdz / berts

DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models
MIT License

Training on NER #24

Closed: francescogianferraripini closed this issue 4 years ago

francescogianferraripini commented 4 years ago

Hi. How can I train the XXL Italian model for a downstream NER task?

stefan-it commented 4 years ago

Hi @francescogianferraripini ,

for all experiments with Italian NER we used the Transformers library.

Good documentation can be found here.

The "hardest" part here is preprocessing your data. The Transformers fine-tuning script for token classification (NER, PoS tagging, chunking...) expects a "Token Label" per line format (empty lines denote new sentence).

After the preprocessing part, you can use a JSON-based configuration file where you specify all necessary parameters/hyper-parameters, like:

{
    "data_dir": ".data",
    "labels": "labels.txt",
    "model_name_or_path": "dbmdz/bert-base-italian-xxl-cased",
    "output_dir": "bert-base-italian-xxl-cased-model-1",
    "max_seq_length": 128,
    "num_train_epochs": 10,
    "per_device_train_batch_size": 16,
    "save_steps": 703,
    "seed": 1,
    "do_train": true,
    "do_eval": true,
    "do_predict": true,
    "load_best_model_at_end": true,
    "fp16": true,
    "overwrite_output_dir": true
}
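The labels.txt referenced in the config simply lists one label per line. A minimal sketch, assuming an IOB-style tag set with persons, locations and organizations (adjust this to the labels that actually occur in your data):

O
B-PER
I-PER
B-LOC
I-LOC
B-ORG
I-ORG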

Then you can run the examples/token_classification/run_ner.py script and pass the JSON-based configuration file as the first argument.
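For example, assuming the configuration above was saved as config.json (a hypothetical file name):

python examples/token_classification/run_ner.py config.json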

Fine-tuning will then start, and the fine-tuned model will be stored in the specified output_dir :)
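Once fine-tuning has finished, you can load the model from output_dir like any other Transformers model. A minimal sketch, assuming the output_dir from the config above and a made-up example sentence:

from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Path assumed to match the output_dir from the config above
model_dir = "bert-base-italian-xxl-cased-model-1"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForTokenClassification.from_pretrained(model_dir)

# Build a token-classification pipeline and tag an example sentence
ner = pipeline("ner", model=model, tokenizer=tokenizer)
print(ner("Giuseppe Verdi nacque a Busseto."))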

I hope this helps!

francescogianferraripini commented 4 years ago

Thanks a lot! I was more or less on that path, but this greatly helps.