Binary RC Konfigs - Githubissues

leonhardhennig commented 2 years ago

The template would be the RelEx project (the predecessor of Sherlock): https://github.com/DFKI-NLP/RelEx/tree/master/configs/relation_classification/tacred . In principle we would only need to take over the configs and adapt them where necessary (in case anything about the configs changed between AllenNLP 0.9.0 and 2.8.0). The "relex/baseline_self_attention_tacred_bert.jsonnet" config corresponds to the "transformer.jsonnet" config.

Port all AllenNLP baseline_*.jsonnet configs analogously to sherlock/configs/binary_rc (with sensible names). Where BERT is used, provide both cased and uncased. self_attention -> transformers, if possible with ELMo and GloVe. We can skip BERT here, since that is easier with HF Transformers.
For HF Transformers we may only need to create a training script that iterates over all the "start parameters":
- BERT: bert-base-uncased and cased, bert-large cased/uncased
- https://github.com/DFKI-NLP/sherlock/issues/29 (ALBERT large, RoBERTa large, SpanBERT?)
- LUKE
- XLM-RoBERTa, ELECTRA?, DistilBERT?
- and/or check the best papers at https://paperswithcode.com/sota/relation-extraction-on-tacred again to see whether they include further interesting architectures (e.g. KnowBERT, ERNIE?)
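
A minimal sketch of the outer loop of such a training script, in Python, might look like the following. It only enumerates run configurations and launches nothing, since the launch command depends on the training script's interface. The `build_runs` helper, the dictionary keys, and the hyperparameter values in the example call are hypothetical placeholders; only the four BERT checkpoint names come from the list above.

```python
from itertools import product

# The four BERT checkpoints named in the list above.
BERT_MODELS = [
    f"bert-{size}-{casing}"
    for size, casing in product(("base", "large"), ("uncased", "cased"))
]

# Further models from the list would be appended here under their
# Hugging Face hub identifiers.
MODELS = list(BERT_MODELS)


def build_runs(models, learning_rates, batch_size, epochs):
    """Return one run configuration (a plain dict) per model/learning-rate pair."""
    return [
        {
            "model": model,
            "learning_rate": lr,
            "batch_size": batch_size,
            "epochs": epochs,
        }
        for model, lr in product(models, learning_rates)
    ]


# Placeholder values, chosen only to show the shape of the grid.
runs = build_runs(MODELS, learning_rates=(2e-5, 3e-5), batch_size=32, epochs=3)
for run in runs:
    print(run)
```

Each dict would then be turned into one invocation of the training script, so adding a model or a parameter value only changes the inputs to `build_runs`.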

leonhardhennig commented 2 years ago

baseline_boe.jsonnet is in branch "config". Tested on TACRev, OK.

TODO: Extend to more models where it makes sense.

phucdev commented 2 years ago

We will stick to using https://github.com/DFKI-NLP/sherlock/blob/master/scripts/cluster/binary_relation_clf_en.sh and perform some hyperparameter tuning to get a good enough model for prediction.

phucdev commented 2 years ago

I kept the batch size fixed at 32 and trained the model using different learning rates (2e-5, 3e-5, 4e-5, 5e-5) for 5 epochs.

The best bert-base-uncased model, with a learning rate of 4e-5, achieved an F1 score of 0.883 on the test split of the unionized relation dataset. The best roberta-base model, also with a learning rate of 4e-5, achieved an F1 score of 0.892 on the same test split.
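
For reference, the sweep settings and the model selection can be written down as a short Python sketch. The `RunResult` class and `best_run` helper are hypothetical names used for illustration, not part of sherlock. Only the two scores reported above are filled in, and the scores of the other learning rates are left out.

```python
from dataclasses import dataclass

# Sweep settings as described above.
BATCH_SIZE = 32
EPOCHS = 5
LEARNING_RATES = (2e-5, 3e-5, 4e-5, 5e-5)


@dataclass(frozen=True)
class RunResult:
    model: str
    learning_rate: float
    f1: float  # F1 on the test split


def best_run(results):
    """Return the result with the highest F1 score."""
    return max(results, key=lambda r: r.f1)


# A full sweep would hold one RunResult per model and learning rate;
# only the best score per model was reported.
reported = [
    RunResult("bert-base-uncased", 4e-5, 0.883),
    RunResult("roberta-base", 4e-5, 0.892),
]

best = best_run(reported)
print(best.model, best.learning_rate, best.f1)
```

With these two entries, `best_run` picks roberta-base, which is ahead of bert-base-uncased by 0.009 F1.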

DFKI-NLP / sherlock

Binary RC Konfigs #53