The "label" column in the JSTS dataset has a string dtype

yahoojapan / JGLUE

JGLUE: Japanese General Language Understanding Evaluation

Creative Commons Attribution Share Alike 4.0 International


Hi, thanks for publishing JGLUE.

I think run_glue.py decides whether a task is a regression task from the dtype of the label column, so when the label is a string dtype the task is treated as classification. https://github.com/huggingface/transformers/blob/v4.9.2/examples/pytorch/text-classification/run_glue.py

In fact, fine-tuning BERT on JSTS produced a 26-class classification model. (I have patched run_glue.py to work around this.)
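
To illustrate the behavior described above, here is a minimal sketch in plain Python. It is not the actual patch and does not import run_glue.py. The helper names `infer_task` and `cast_labels_to_float` and the sample rows are hypothetical, and the dtype check is only my reading of how the script behaves for custom data files.

```python
def infer_task(label_dtype, labels):
    """Mimic a dtype-based check: only float labels count as regression.

    Returns (is_regression, num_labels). This mirrors my understanding of
    the script's logic; it is an assumption, not a copy of its code.
    """
    is_regression = label_dtype in ("float32", "float64")
    if is_regression:
        # Regression uses a single output value.
        return is_regression, 1
    # Otherwise every distinct label value becomes its own class.
    return is_regression, len(set(labels))


def cast_labels_to_float(records):
    """Return copies of the records with the "label" field parsed as float."""
    return [{**r, "label": float(r["label"])} for r in records]


# Hypothetical rows with string labels, as in the situation reported here.
rows = [{"label": "0.0"}, {"label": "2.4"}, {"label": "5.0"}, {"label": "2.4"}]

# String labels: treated as classification with one class per distinct value.
as_strings = infer_task("string", [r["label"] for r in rows])

# After casting to float, the same data is treated as regression.
fixed = cast_labels_to_float(rows)
as_floats = infer_task("float64", [r["label"] for r in fixed])
```

With string labels the sketch counts one class per distinct score, which is how a similarity dataset can end up as a many-class classifier. Casting the labels to float before training (or storing them as numbers in the dataset) avoids that.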
