CODAIT
/
Identifying-Incorrect-Labels-In-CoNLL-2003
Research into identifying and correcting incorrect labels in the CoNLL-2003 corpus.
Apache License 2.0
12
stars
2
forks
issues
Changed label of national team names from LOC -> ORG incompatible with MUC guidelines
#43
andreasgrv
opened
1 year ago
0
Code for Identifying Incorrect Labels
#42
GSidiropoulos
opened
2 years ago
3
Reproduce results
#41
GSidiropoulos
closed
2 years ago
1
Broken with Pandas 1.2
#40
jcklie
closed
3 years ago
3
Fix data entry errors pointed out in issues #37 and #38
#39
frreiss
closed
3 years ago
0
Sports teams in test split
#38
alanakbik
closed
3 years ago
3
One entity marked as "MIC" instead of "MISC"
#37
alanakbik
closed
3 years ago
2
Adjust audited files so that we can generate all_conll_corrections_combined.csv automatically
#36
frreiss
closed
3 years ago
8
Correct missing errors after span errors
#35
xuhdev
closed
4 years ago
1
download_and_correct_corpus.py does not handle "Missing" errors that overlap with "Span"
#34
frreiss
closed
4 years ago
2
Add missing EOLs in token corrections, and some corrections to the token correction file
#33
xuhdev
closed
4 years ago
8
Temporarily remove token processing that generates erroneous results
#32
xuhdev
closed
4 years ago
6
Remove leading column from merged corrections file.
#31
frreiss
closed
4 years ago
4
Fix correction script with span [16, 22): 'S Minn'
#30
BryanCutler
closed
4 years ago
2
Cleaned up and updated READMEs
#29
BryanCutler
closed
4 years ago
1
Correct root cause of combined correction patches in annotated files
#28
BryanCutler
closed
3 years ago
2
Delete all specific skipping messages
#27
xuhdev
closed
4 years ago
4
Patching all_conll_corrections_combined to remove errors and warnings
#26
BryanCutler
closed
4 years ago
6
Fix lines tagged as "I-O"
#25
BryanCutler
closed
4 years ago
7
Remove some lines that lead to "invalid tag O" warning
#24
xuhdev
closed
4 years ago
4
Inter-annotator agreement notebook gives a division-by-zero error
#23
BryanCutler
opened
4 years ago
1
More processing to ensure the proper tag type (I or B)
#22
xuhdev
closed
4 years ago
0
Cleanup and verify scripts
#21
BryanCutler
closed
4 years ago
4
For the first token in a span correction, type should be "B-" if the token before it is "B-" or "I-"
#20
xuhdev
closed
4 years ago
6
Cleanup READMEs
#19
BryanCutler
closed
4 years ago
0
Verify scripts and notebooks
#18
BryanCutler
closed
4 years ago
0
About 10 lines are tagged as "I-O"
#17
xuhdev
closed
4 years ago
5
Cleanup download and correct script
#16
BryanCutler
closed
4 years ago
7
Update sentence boundaries file
#15
frreiss
closed
3 years ago
1
Add detailed steps on how we produced the experimental results
#14
xuhdev
closed
4 years ago
5
Add instructions and scripts on how to reproduce Section 7.2 experiments
#13
xuhdev
closed
4 years ago
1
Span marked incorrect in dev fold document 7 may actually be correct
#12
frreiss
opened
4 years ago
1
Incorporate sentence boundary error corrections into the integration script
#11
xuhdev
closed
4 years ago
2
Update script to compute precision and recall
#10
BryanCutler
closed
4 years ago
2
Fix errors in Label_Stats notebook
#9
BryanCutler
closed
4 years ago
2
All corrected datasets are single-line files
#8
xuhdev
closed
4 years ago
2
In the finalized dataset, examine whether in the test corpus, "Germay" line ends up with a space
#7
xuhdev
closed
4 years ago
1
download_corpus_and_correct_labels should also apply sentence boundary corrections
#6
xuhdev
closed
4 years ago
0
Audits for pending files
#5
kmh4321
closed
4 years ago
2
Some spots require correction by hand
#4
xuhdev
closed
4 years ago
1
Fix label warnings on correction script
#3
BryanCutler
closed
4 years ago
3
Automate token corrections
#2
frreiss
closed
3 years ago
4
Two misaligned/missing entities `[224, 257): ‘OCASEK GOVERNMENT OFFICE BUILDING’` and `[21, 24): ‘T&N’`
#1
xuhdev
closed
4 years ago
1