
Auxiliary Objectives for Neural Error Detection Models #4

Metadata

- Paper: Auxiliary Objectives for Neural Error Detection Models (Rei and Yannakoudakis, 2017)
- Venue: The 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2017)

Summary

This work can be viewed as a follow-up to Rei (2017) #3, which proposes a semi-supervised multi-task learning framework that integrates language modeling as an additional objective. This work extends the auxiliary objectives to token frequency, first language, error type, part-of-speech (POS) tags and syntactic dependency tags (grammatical relations). It also investigates auxiliary datasets (chunking, named entity recognition and POS-tagging) under different training strategies (pre-training vs. multi-tasking). The experiments show that the auxiliary task of predicting POS + syntactic dependency tags gives a consistent improvement for error detection, and pre-training on an auxiliary chunking dataset also helps.


Auxiliary Tasks

Table 1

The task (cross-entropy loss) weights are [0.05, 0.1, 0.2, 0.5, 1.0], corresponding to the task order above.
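
A minimal PyTorch sketch of this weighted multi-task objective (PyTorch, the task names, label counts and tensor shapes are all my assumptions, not the paper's code): shared token representations from the bidirectional LSTM feed one softmax output layer per task, and the total loss is the weighted sum of the per-task cross-entropies.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy dimensions; `shared` stands in for BiLSTM token states shared across tasks.
batch, seq_len, hidden = 32, 40, 200
shared = torch.randn(batch, seq_len, hidden)

# Task -> (label count, loss weight). Names and label counts are placeholders;
# the actual task-to-weight mapping follows Table 1.
tasks = {
    "aux_a": (10, 0.05),
    "aux_b": (20, 0.1),
    "aux_c": (30, 0.2),
    "aux_d": (50, 0.5),
    "error": (2, 1.0),   # main binary error-detection objective
}

heads = {name: nn.Linear(hidden, n_labels) for name, (n_labels, _) in tasks.items()}
gold = {name: torch.randint(0, n_labels, (batch, seq_len))
        for name, (n_labels, _) in tasks.items()}

total_loss = sum(
    weight * F.cross_entropy(
        heads[name](shared).reshape(-1, n_labels),  # (batch*seq, n_labels)
        gold[name].reshape(-1),                     # (batch*seq,)
    )
    for name, (n_labels, weight) in tasks.items()
)
total_loss.backward()  # gradients flow into every task head (and the encoder, in a real model)
```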


Error Detection Datasets

The models are trained on the FCE dataset and evaluated on the FCE test set and the two official annotations of the CoNLL-2014 shared task test set.


Experimental Results on Different Auxiliary Tasks for Error Detection

Table 2

Table 3

  1. For evaluation on FCE, the results show that only predicting error types and syntactic dependency tags improves performance. Surprisingly, predicting syntactic dependency tags helps the most, even though dependency parses are likely to be incorrect for grammatically erroneous sentences.
  2. For evaluation on CoNLL-2014, however, predicting error types and syntactic dependency tags helps little or even hurts. Instead, predicting POS tags surprisingly becomes beneficial to error detection.
  3. In conclusion, the auxiliary tasks do not give consistent improvements (only POS + dependency tags improve on both test sets). Since the results vary across test sets, it is hard to identify which auxiliary tasks really help error detection. (The evaluation metric behind these numbers is sketched below.)
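
Results in this line of error-detection work are reported with the token-level F0.5 measure, which weights precision twice as heavily as recall. A quick sketch of the metric (the function name and example numbers are mine):

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """F-beta score; beta < 1 emphasizes precision, as preferred for
    error detection where false alarms are costly for learners."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# e.g. precision 0.60, recall 0.25 -> F0.5 ~ 0.47
print(round(f_beta(0.60, 0.25), 2))
```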

Experimental Results on Different Training Strategies with Auxiliary Datasets

Tables 4 & 5

These compare pre-training vs. multi-tasking with the auxiliary chunking, named entity recognition and POS-tagging datasets; pre-training on the chunking dataset gives the clearest improvement (see the sketch of both strategies below).
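
A minimal outline of the two strategies (the `train_step` helper and all names are hypothetical, not from the paper): pre-training optimizes the auxiliary task on its dataset first and then fine-tunes the same parameters on error detection, whereas multi-tasking interleaves batches from both datasets within a single training run.

```python
import random

def pretrain_then_finetune(model, aux_batches, error_batches, train_step):
    # Strategy 1: pre-train on the auxiliary dataset (e.g. chunking),
    # then fine-tune the same model on the error-detection dataset.
    for batch in aux_batches:
        train_step(model, batch, task="aux")
    for batch in error_batches:
        train_step(model, batch, task="error")

def multitask(model, aux_batches, error_batches, train_step, seed=0):
    # Strategy 2: interleave batches from both datasets in one run;
    # each batch only updates the loss of its own task.
    mixed = [("aux", b) for b in aux_batches] + [("error", b) for b in error_batches]
    random.Random(seed).shuffle(mixed)
    for task, batch in mixed:
        train_step(model, batch, task=task)
```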


Additional Training Data

Trained on a larger dataset combining the CLC, NUCLE and Lang-8 corpora (about 3 million parallel sentences, I think...) with the auxiliary POS-tagging task, and tested on the FCE and CoNLL-2014 test sets.

Table 6

  1. No improvement on the FCE test set, since FCE is a subset of the CLC dataset. It is likely that the available in-domain training data is already sufficient, so the auxiliary objective offers no additional benefit.
  2. 1.8% and 1.1% absolute improvements on the two CoNLL-2014 test set annotations.

References

- Marek Rei. 2017. Semi-supervised Multitask Learning for Sequence Labeling. In Proceedings of ACL 2017. (See note #3.)
- Marek Rei and Helen Yannakoudakis. 2017. Auxiliary Objectives for Neural Error Detection Models. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2017).