The README's output examples show that, for each model (BERT, GANBERT), the metrics eval_accuracy, eval_f1_micro, eval_precision, and eval_recall are all identical. For example, for BERT:
I ran your model with sh run_experiment.sh and got numerically different results, but the same equality across eval_accuracy, eval_f1_micro, eval_precision, and eval_recall within each model persists. For example, for GANBERT I get:
I faced the same issue, so I updated the code to print the confusion matrix using the labels provided in data_processors.py. This repository might help!
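For reference, a minimal sketch of how such a confusion matrix could be printed, assuming the true and predicted label ids are already available as arrays and that label_list holds the label ordering used by the processor in data_processors.py (the function name and arguments here are illustrative, not the repository's actual code):

```python
# Illustrative sketch, not the repository's actual implementation.
from sklearn.metrics import confusion_matrix

def print_confusion_matrix(y_true, y_pred, label_list):
    """Print a confusion matrix whose rows/columns follow label_list order."""
    cm = confusion_matrix(y_true, y_pred, labels=list(range(len(label_list))))
    width = max(len(label) for label in label_list) + 2
    # Header row: predicted labels.
    print("".rjust(width) + "".join(label.rjust(width) for label in label_list))
    # One row per true label.
    for label, row in zip(label_list, cm):
        print(label.rjust(width) + "".join(str(c).rjust(width) for c in row))

# Example with dummy label ids (hypothetical label names):
# print_confusion_matrix([0, 1, 2, 1], [0, 2, 2, 1], ["neg", "neu", "pos"])
```

Per-class counts like these make it visible which classes the aggregate metrics are hiding.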
Is it because you're micro-averaging, and therefore "micro-F1 = micro-precision = micro-recall = accuracy"?
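For context, in single-label multi-class classification every prediction is exactly one true positive or one misclassification, so micro-averaged precision, recall, and F1 all reduce to the fraction of correct predictions, i.e. accuracy. A quick sanity check (the dummy labels below are illustrative, not the model's actual outputs):

```python
# Illustrative check: with average="micro", precision, recall, and F1
# all equal plain accuracy for single-label multi-class predictions.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 1, 2, 2, 1, 0, 2]  # dummy gold labels
y_pred = [0, 2, 2, 1, 1, 0, 2]  # dummy predictions

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="micro"
)
accuracy = accuracy_score(y_true, y_pred)

print(accuracy, precision, recall, f1)  # all four values are identical
```

If per-class behaviour is what you want to see, average="macro" or average=None exposes the differences that the micro-averaged numbers hide.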