Question regarding mark_utf8

In write.output.solution (create_ps.r) you have

out.txt = mark_utf8(out.txt)

I am unsure about its purpose. This line sometimes leads to errors when it runs before my function "fix.parser.inconsistencies", because the result is incompatible with the stringr package, e.g. with stringr::str_length().

Commenting out the line fixes the error, and the resulting solution looks fine to me (in particular regarding umlauts). Perhaps the following code makes my point clearer:

fix.parser.inconsistencies("Test ü")
[1] "Test ü"
mark_utf8("Test ü")
[1] "Test \xfc"
str_length("Test ü")
[1] 6
str_length("Test \xfc")
[1] 6
str_length(mark_utf8("Test ü"))
[1] NA
Warning message:
In stri_length(string) :
  invalid UTF-8 byte sequence detected; try calling stri_enc_toutf8()
fix.parser.inconsistencies(mark_utf8("Test ü"))
[1] "Test �"

I am a bit wary of whether commenting out the line is the way to go, because I do not fully understand its purpose. Maybe I found an error in mark_utf8 itself, as str_length("Test \xfc") does work?
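To make the suspicion more concrete, here is a minimal sketch in base R of what might be happening. It assumes (this is a guess, I have not checked the source) that mark_utf8 only re-labels the string as UTF-8 via Encoding<- without converting its bytes. On a system where "ü" is stored as the single latin1 byte 0xfc, such a re-labelled string would be invalid UTF-8, which would match the "\xfc" output and the stringi warning above. The variable names are only illustrative.

```r
# Build "Test ü" from raw latin1 bytes so the example does not depend on
# the encoding of this file or of the current session.
x <- rawToChar(as.raw(c(0x54, 0x65, 0x73, 0x74, 0x20, 0xfc)))
Encoding(x) <- "latin1"

# ASSUMPTION: mark_utf8() behaves roughly like this, i.e. it only changes
# the encoding label and leaves the bytes untouched.
relabelled <- x
Encoding(relabelled) <- "UTF-8"
validUTF8(relabelled)   # FALSE: a lone 0xfc byte is not valid UTF-8

# enc2utf8() converts the bytes instead of only changing the label.
converted <- enc2utf8(x)
validUTF8(converted)    # TRUE
charToRaw(converted)    # the umlaut is now the two bytes c3 bc
nchar(converted)        # 6
```

If that assumption holds, replacing the re-labelling with a real conversion (enc2utf8, or stri_enc_toutf8 as the warning suggests) might be an alternative to commenting out the line.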

skranz / RTutor
