PolMine / GermaParl

GermaParl R Data Package

Encoding errors for quotation marks in lp 13-15 #9

Open Studentenfutter opened 5 years ago

Studentenfutter commented 5 years ago

I found 1988 errors in the encoding of quotation marks in GermaParl, at least in legislative periods (lp) 13 to 15. In interjections, the leading German upper quotation mark (") seems to be replaced by the string:

'' '' '' '' '' '' '' ``

Code for reproducing:

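library(polmineR)
# Restrict the corpus to interjections in lp 13 to 15, then show every `` token in context.
interjection_subcorpus <- partition("GERMAPARL", interjection = TRUE, lp = 13:15)
kwic(interjection_subcorpus, query = "\`\`", regex = TRUE, left = 15, right = 15, s_attributes = c("lp", "session"))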