google-research / electra

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

Finetuning on tagging task for custom dataset outputs f1 = 0 #110

Closed: etetteh closed this issue 3 years ago

etetteh commented 3 years ago

I created the following task class:

class BC5C(TaggingTask):
  """Entity-level text chunking."""

  def __init__(self, config, tokenizer):
    # The final argument is TaggingTask's is_token_level flag
    # (the repo's entity-level Chunking task passes False here).
    super(BC5C, self).__init__(config, "bc5c", tokenizer, True)
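
To make the task selectable via the task_names config entry, it also has to be registered in the task builder. A minimal sketch, assuming the if/elif dispatch in get_task() in finetune/task_builder.py (the "bc5c" branch is my addition; the other branch just paraphrases the repo's existing pattern):

from finetune.tagging import tagging_tasks

# Sketch of get_task() from finetune/task_builder.py with a branch for
# the custom task; only the "bc5c" branch is new.
def get_task(config, task_name, tokenizer):
  if task_name == "chunk":
    return tagging_tasks.Chunking(config, tokenizer)
  elif task_name == "bc5c":  # the custom BC5C task defined above
    return tagging_tasks.BC5C(config, tokenizer)
  else:
    raise ValueError("Unknown task " + task_name)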

The format of the data can be inferred from the attached file.
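
For context, TaggingTask loads each split from <raw_data_dir>/<split>.txt in a CoNLL-style layout: one whitespace-separated "token tag" pair per line, with blank lines separating sentences. A minimal reader sketch approximating _get_labeled_sentences in finetune/tagging/tagging_tasks.py (my paraphrase, not verbatim repo code):

# Approximation of the reading convention in
# TaggingTask._get_labeled_sentences: one "token tag" pair per line,
# blank lines separating sentences, "-DOCSTART-" lines skipped.
def read_tagged_sentences(path):
  sentences, current = [], []
  with open(path) as f:
    for line in f:
      parts = line.strip().split()
      if not parts:  # blank line ends the current sentence
        if current:
          sentences.append(current)
          current = []
        continue
      if parts[0] == "-DOCSTART-":  # CoNLL document marker
        continue
      current.append((parts[0], parts[-1]))  # (word, tag)
  if current:
    sentences.append(current)
  return sentences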

Training during finetuning runs fine, but evaluation on the dev set reports zero (0) for every metric except the loss, which is typically around 0.4.
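
For reference, which metrics get reported at eval time depends on that last constructor argument. My reading of the scorer selection in finetune/tagging/tagging_tasks.py (paraphrased, worth verifying against the repo):

from finetune.tagging import tagging_metrics

class TaggingTask:  # abridged: only the scorer selection shown
  def get_scorer(self):
    # Token-level tasks are scored with accuracy; entity-level tasks
    # with span-level precision/recall/F1, which is where the reported
    # f1 value comes from.
    if self._is_token_level:
      return tagging_metrics.AccuracyScorer()
    return tagging_metrics.EntityLevelF1Scorer(self._get_label_mapping())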

Has anyone encountered this issue before, and what is the fix?

Joseph-Vineland commented 2 years ago

Did you find a solution? If so, can you please share your code and what you did?

I also have a custom sequence tagging task, but ELECTRA fails just before the final metrics are reported.

EDIT: I used your code snippet and file to find the solution. Thanks!