The challenge of this year’s shared task was to incorporate the unannotated data.
The participants were given access to the corpus after some linguistic preprocessing had been done: for all data, a tokenizer, part-of-speech tagger, and a chunker were applied to the raw data.
Named entity tagging of the English and German training, development, and test data was done by hand at the University of Antwerp.
The data contains entities of four types: persons (PER), organizations (ORG), locations (LOC) and miscellaneous names (MISC).
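To make the four entity types concrete, here is a minimal sketch of pulling entities out of preprocessed data. It assumes a four-column layout (token, part-of-speech tag, chunk tag, named entity tag) with `I-`/`B-` prefixes on the entity tags; the sample sentence and the helper name `extract_entities` are illustrative, not part of the excerpts above.

```python
# Assumed layout: one token per line, four whitespace-separated columns
# (token, POS tag, chunk tag, named entity tag); "O" marks non-entities.
SAMPLE = """\
U.N. NNP I-NP I-ORG
official NN I-NP O
Ekeus NNP I-NP I-PER
heads VBZ I-VP O
for IN I-PP O
Baghdad NNP I-NP I-LOC
. . O O
"""

ENTITY_TYPES = {"PER", "ORG", "LOC", "MISC"}


def extract_entities(text):
    """Return (entity text, entity type) pairs in order of appearance."""
    entities = []
    current_tokens, current_type = [], None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4:
            token, _pos, _chunk, tag = fields
        else:
            # Blank or malformed lines end any open entity.
            token, tag = None, "O"
        prefix, _, etype = tag.partition("-")
        # An entity continues only on an I- tag of the same type.
        continues = prefix == "I" and etype == current_type
        if current_tokens and not continues:
            entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
        if tag != "O" and etype in ENTITY_TYPES:
            current_tokens.append(token)
            current_type = etype
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


print(extract_entities(SAMPLE))
```

A `B-` tag or a change of type closes the open entity and starts a new one, so adjacent entities are kept apart.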
The most frequently applied technique in the CoNLL-2003 shared task is the Maximum Entropy Model. Five systems used this statistical learning method. Three systems used Maximum Entropy Models in isolation.
Hidden Markov Models were employed by four of the systems.
Voted perceptrons were applied to the shared task data.
Transformation-based learning (Florian et al., 2003), Support Vector Machines (Mayfield et al., 2003) and Conditional Random Fields (McCallum and Li, 2003) were applied by one system each.
Five participating groups have applied system combination.
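The excerpts do not say how those groups combined their systems. As one simple possibility, a per-token majority vote over the tag sequences of several taggers could look like the sketch below. The function name `majority_vote` and the tie-breaking rule (prefer the earliest-listed system) are assumptions made for illustration, not a description of any participant's method.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine tag sequences from several systems by per-token majority vote.

    predictions: list of tag sequences, one per system, all of equal length.
    Ties go to the tag proposed by the earliest-listed system (an assumption).
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all systems must tag the same number of tokens")
    combined = []
    for tags in zip(*predictions):
        counts = Counter(tags)
        best = max(counts.values())
        # Walk the tags in system order so ties resolve deterministically.
        winner = next(t for t in tags if counts[t] == best)
        combined.append(winner)
    return combined


system_a = ["I-ORG", "O", "I-PER"]
system_b = ["I-ORG", "O", "O"]
system_c = ["I-LOC", "O", "I-PER"]
print(majority_vote([system_a, system_b, system_c]))
```

Listing the most trusted system first makes it the tie-breaker, which is a cheap way to encode a preference without per-system weights.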
dl.acm.org/citation.cfm?id=1119195