OpenGPTX / lm-evaluation-harness

A framework for few-shot evaluation of autoregressive language models.

MIT License

XStance Topic Classification #43

Open malteos opened 2 years ago

malteos commented 2 years ago

Use the XStance benchmark for topic classification instead of stance detection.

See https://github.com/OpenGPTX/lm-evaluation-harness/blob/master/lm_eval/tasks/x_stance.py

aishwaryaanegundi commented 1 year ago

Evaluation of the topic classification task on the XStance dataset

HF dataset: https://huggingface.co/datasets/x_stance
Task: xstance_tc
Metrics: classification accuracy (acc), precision, recall, and F1 (f1)
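
For reference, here is a minimal sketch of how the four metrics could be computed for a multi-class topic task. It assumes macro averaging over topics, which this thread does not confirm for `xstance_tc`. The function name `macro_scores` and the toy topic labels are made up for illustration and are not taken from the harness.

```python
from collections import Counter


def macro_scores(gold, pred):
    """Accuracy plus macro-averaged precision, recall and F1.

    Assumption: macro averaging (unweighted mean over classes).
    The xstance_tc task may aggregate differently.
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted p, but the gold label was something else
            fn[g] += 1  # gold label g was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "acc": sum(g == p for g, p in zip(gold, pred)) / len(gold),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy data (hypothetical labels): the model always predicts the same topic.
gold = ["Economy", "Welfare", "Education", "Economy"]
pred = ["Economy", "Economy", "Economy", "Economy"]
scores = macro_scores(gold, pred)
print(scores)
```

In this toy case accuracy is 0.5, but macro recall is only 1/3: the single predicted topic has recall 1 and the other two have recall 0. Under macro averaging, predictions that collapse onto one class give low precision, recall and F1 regardless of accuracy.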

Zero-shot Results

From literature

No prior work on topic classification with the XStance dataset was found.

From eval harness

Model: gpt2

With only the question in the prompt

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| xstance_tc | 0 | acc | 0.1 | |
| | | precision | 0.0303 | |
| | | recall | 0.0909 | |
| | | f1 | 0.0455 | |

With both the question and the comment in the prompt

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| xstance_tc | 0 | acc | 0.1 | |
| | | precision | 0.0227 | |
| | | recall | 0.0909 | |
| | | f1 | 0.0364 | |

Comments

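Each XStance example consists of a question and a comment. Performance is slightly better with only the question in the prompt: accuracy (0.1) and recall (0.0909) are the same in both settings, while precision (0.0303 vs. 0.0227) and F1 (0.0455 vs. 0.0364) are higher without the comment.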
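
To make the two prompt settings concrete, here is a minimal sketch of how they could be built. The template (`Question:` / `Comment:` / `Topic:`), the function name `build_prompt` and the example record are all hypothetical. The actual prompt used by `xstance_tc` is not shown in this thread, and the `question` and `comment` field names are assumed from the description of the dataset above.

```python
def build_prompt(example, include_comment):
    """Build a zero-shot topic-classification prompt.

    Hypothetical template: the real xstance_tc prompt may differ.
    """
    parts = [f"Question: {example['question']}"]
    if include_comment:
        parts.append(f"Comment: {example['comment']}")
    parts.append("Topic:")  # the model is expected to continue with a topic
    return "\n".join(parts)


# Made-up record for illustration only, not taken from the dataset.
example = {
    "question": "Should the retirement age be raised?",
    "comment": "People live longer, so working longer is reasonable.",
}

question_only = build_prompt(example, include_comment=False)
question_and_comment = build_prompt(example, include_comment=True)
print(question_only)
print("---")
print(question_and_comment)
```

The only difference between the two settings is the extra `Comment:` line, so any gap between the two sets of scores comes from that added context.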