CODAIT / Identifying-Incorrect-Labels-In-CoNLL-2003

Research into identifying and correcting incorrect labels in the CoNLL-2003 corpus.
Apache License 2.0

Automate token corrections #2

Closed frreiss closed 3 years ago

frreiss commented 4 years ago

Go through the manual fixes (for token corrections) that were applied towards the end of the paper-writing process and either automate or semi-automate those fixes.

ZachEichen commented 4 years ago

The semi-automation for expanding the team entries should be the same as before. The logic to find entries and correct them still needs development.

frreiss commented 4 years ago

@ZachEichen any update on this?

frreiss commented 4 years ago

Update from @ZachEichen:

Zach will put in a PR with the first stage of these changes within roughly the next week.

frreiss commented 3 years ago

The changes in #36 make this requirement less urgent. Most of the manual corrections were due to conflicts between different sets of human labels that have since been resolved.