fhamborg / NewsWCL50

The first, open access evaluation dataset for methods to identify bias by word choice and labeling
Creative Commons Attribution Share Alike 4.0 International

Corrupt line #1

Closed tnusser closed 5 years ago

tnusser commented 5 years ago

https://github.com/fhamborg/NewsWCL50/blob/888677e3f79567132609582def8cfc54bb7d7e2a/Annotations.csv#L7919
https://github.com/fhamborg/NewsWCL50/blob/888677e3f79567132609582def8cfc54bb7d7e2a/Annotations.csv#L7920
https://github.com/fhamborg/NewsWCL50/blob/888677e3f79567132609582def8cfc54bb7d7e2a/Annotations.csv#L7921

These three lines need to be joined into a single row.
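For anyone hitting the same problem in a local copy, here is a rough sketch of how rows split across several physical lines could be merged back together. It is not the fix applied in the repository. The function name `join_split_rows`, the `expected_fields` parameter, the comma delimiter, and the sample rows are all assumptions made for illustration; the actual column layout of Annotations.csv is not described in this thread.

```python
import csv


def join_split_rows(lines, expected_fields, delimiter=","):
    """Merge consecutive physical lines until a record has enough columns.

    Hypothetical helper: `expected_fields` is the column count a complete
    row should have, which the caller has to know in advance.
    """
    records = []
    buffer = ""
    for line in lines:
        line = line.rstrip("\r\n")
        # Glue a continuation line onto the pending fragment with one space.
        buffer = f"{buffer} {line}" if buffer else line
        fields = next(csv.reader([buffer], delimiter=delimiter))
        if len(fields) >= expected_fields:
            records.append(fields)
            buffer = ""
    if buffer:
        # Keep a trailing fragment that never reached the expected width.
        records.append(next(csv.reader([buffer], delimiter=delimiter)))
    return records


# Made-up sample: the second record is broken across three physical lines.
sample = ["1,a,hello", "2,some", "broken", "phrase,b", "3,c,bye"]
rows = join_split_rows(sample, expected_fields=3)
```

Note the limitation: if the break falls inside the last column, the first fragment already has the expected number of fields, so this sketch would not detect it.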

fhamborg commented 5 years ago

Thanks for reporting this. We are looking into it.

anastasia-zhukova commented 5 years ago

The coded phrase starts in the middle of a word. That partial word does not exist among the tokens, so the entire phrase was marked as missing.
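To illustrate the effect described above, here is a minimal sketch of exact token matching. It assumes a simple whitespace tokenizer and a hypothetical `find_phrase` helper; the sentence is made up for the example, and this is not the project's actual matching code.

```python
def find_phrase(tokens, phrase):
    """Return the index where the phrase's words match consecutive tokens.

    Returns None when any word of the phrase is not a whole token.
    """
    words = phrase.split()
    n = len(words)
    if n == 0:
        return None
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == words:
            return i
    return None


# Made-up example sentence, tokenized on whitespace.
tokens = "The undocumented immigrants crossed the border".split()

# A phrase aligned with token boundaries is found.
aligned = find_phrase(tokens, "undocumented immigrants")

# A phrase that starts mid-word ("documented") has no matching token,
# so the whole phrase is reported as missing.
misaligned = find_phrase(tokens, "documented immigrants")
```

Because matching is done on whole tokens, one truncated word at the start is enough for the whole phrase to go unmatched.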

anastasia-zhukova commented 5 years ago

But I think it is a coding error: the coded element is simply too big. (Attached screenshot: 2019-03-13_15-52-00)

fhamborg commented 5 years ago

Fixed.