result is not equivalent when using UTF-8 instead of latin1

Hi,

I created a partition from GermaParl:

library(polmineR)
coi <- partition("GERMAPARL",
                 interjection = FALSE,
                 encoding = "UTF-8",
                 p_attribute = c("word", "lemma"),
                 role = c("mp", "government"))

When I then call kwic() on it:

kwic(coi, query = '".*[Aa]us.*bürger.*"')

R finds no matches and returns NULL with a warning message:

... getting corpus positions
... no matches for query (or no matches left after applying stoplist/positivelist)
NULL
Warning message:
In .local(.Object, ...) :
  No hits for query ".*[Aa]us.*bürger.*" (returning NULL)

When I create the partition with encoding = "latin1" instead of "UTF-8", the same query returns 73 hits:

... getting corpus positions
... number of hits: 73
... checking that all p-attributes are available
... getting token id for p-attribute: word
... generating contexts
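A possible explanation, which is only my assumption and not something I have confirmed for GermaParl or polmineR: if the corpus is stored as latin1 and the query is passed on as UTF-8 bytes, the "ü" in the pattern can never match. The sketch below uses base R only, with a made-up token standing in for a word from the corpus, to show the byte-level mismatch:

```r
# Hypothetical token, standing in for a word stored in a latin1-encoded corpus.
token_utf8   <- "Ausb\u00fcrgerung"
token_latin1 <- iconv(token_utf8, from = "UTF-8", to = "latin1")

# The same visible word is a different byte sequence in each encoding:
# "ü" is two bytes (c3 bc) in UTF-8 and one byte (fc) in latin1.
bytes_utf8   <- charToRaw(token_utf8)
bytes_latin1 <- charToRaw(token_latin1)

# The query from above, once as UTF-8 and once re-encoded to latin1.
pattern_utf8   <- ".*[Aa]us.*b\u00fcrger.*"
pattern_latin1 <- iconv(pattern_utf8, from = "UTF-8", to = "latin1")

# Byte-wise matching (useBytes = TRUE) against the latin1 token:
# only the pattern that shares the token's encoding can match.
hit_utf8_pattern   <- grepl(pattern_utf8,   token_latin1, useBytes = TRUE)
hit_latin1_pattern <- grepl(pattern_latin1, token_latin1, useBytes = TRUE)
```

If this is what happens, re-encoding the query to the corpus encoding before matching and converting the results back to UTF-8 afterwards would be one way around it, but I do not know where in the package that conversion is supposed to take place.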

This is a problem for further workflows, such as highlighting text or simply reading it, because of the encoding.

PolMine / GermaParl

result is not equivalent when using UTF-8 instead of latin1 #11