snomos closed this issue 2 years ago
Turns out the problem is soft hyphens in the text sent from Word. So the categorisation above is probably invalid, and just a happy coincidence of the test data used.
To trigger the error, use the following text - it should contain two soft hyphens:
Áigot nannet sámiid konsultašuvdnarievtti
Seems the error is in libdivvun - @unhammer could you have a look?
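Because soft hyphens are invisible and easily lost in copy-paste, it can help to build the trigger text with explicit escapes. The following is a minimal sketch, not part of the original report. It assumes the two soft hyphens (U+00AD) sit after "konsulta" and after "šuvdna", which is what the checker output further down the thread suggests; the helper name `soft_hyphen_positions` is made up for illustration.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN, invisible in most renderers

# Assumption: the soft hyphens come after "konsulta" and after "šuvdna".
# NFC normalisation keeps the index arithmetic independent of how the
# accented letters were stored by the editor.
text = unicodedata.normalize(
    "NFC",
    " Áigot nannet sámiid konsulta" + SOFT_HYPHEN + "šuvdna" + SOFT_HYPHEN + "rievtti",
)


def soft_hyphen_positions(s: str) -> list[int]:
    """Return the code-point index of every soft hyphen in s (hypothetical helper)."""
    return [i for i, ch in enumerate(s) if ch == SOFT_HYPHEN]


positions = soft_hyphen_positions(text)
print(positions)

# The pieces between the soft hyphens are what ends up flagged as separate words.
print(repr(text[21:29]), repr(text[30:36]))
```

Under that assumption, the soft hyphens fall exactly at the boundaries of the spans 21-29 and 30-36 reported in the checker output in this thread.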
The issue this time was that the character \x1f was found.
INFORMATION SEPARATOR ONE?
Everyone's favourite codepoint! The input was dutkama ja luonddu\x1fdiehtaga,
What is libdivvun doing wrong? I get
$ echo ' Áigot nannet sámiid konsultašuvdnarievtti' | src/divvun-checker -l se
{"errs":[["konsulta",21,29,"typo","Ii leat sátnelisttus",["konsula"],"Čállinmeattáhus"],["šuvdna",30,36,"typo","Ii leat sátnelisttus",["šuvona","govdna"],"Čállinmeattáhus"]],"text":" Áigot nannet sámiid konsultašuvdnarievtti"}
$ echo ' Áigot nannet sámiid konsultašuvdnarievtti' | src/divvun-checker -l se |hl-nonprinting
{"errs":[["konsulta",21,29,"typo","Ii leat sátnelisttus",["konsula"],"Čállinmeattáhus"],["šuvdna",30,36,"typo","Ii leat sátnelisttus",["šuvona","govdna"],"Čállinmeattáhus"]],"text":" Áigot nannet sámiid konsulta-šuvdna-rievtti"}⁋
from the command line with newest giella-sme-speller (that hl-nonprinting is just a script to sed \xad into a dash and EOL into ⁋).
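For anyone without that script, a rough Python stand-in might look like the sketch below. This is not the actual hl-nonprinting script. It maps soft hyphens to a dash and marks line ends with ⁋ as described above; the caret notation for other control characters is an assumption, based on the `^_` shown for \x1f in the output in this thread. The function name `show_nonprinting` is hypothetical.

```python
def show_nonprinting(s: str) -> str:
    """Make soft hyphens, line ends and C0 control characters visible."""
    out = []
    for ch in s:
        if ch == "\u00ad":
            # Soft hyphen becomes a plain dash.
            out.append("-")
        elif ch == "\n":
            # Mark the end of line, but keep the line break itself.
            out.append("⁋\n")
        elif ord(ch) < 0x20:
            # Assumed caret notation: U+001F becomes ^_, U+0009 becomes ^I.
            out.append("^" + chr(ord(ch) + 0x40))
        else:
            out.append(ch)
    return "".join(out)


sample = "konsulta\u00adšuvdna\u00adrievtti luonddu\x1fdiehtaga\n"
visible = show_nonprinting(sample)
print(visible, end="")
```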
@snomos has conflated two issues. The \x1f issue seems to be coming from libdivvun, whereas the soft hyphen issue is our problem.
$ printf ' dutkama ja luonddu\x1fdiehtaga' | src/divvun-checker -l se
{"errs":[],"text":" dutkama ja luonddudiehtaga"}
$ printf ' dutkama ja luonddu\x1fdiehtaga' | src/divvun-checker -l se |hl-nonprinting
{"errs":[],"text":" dutkama ja luonddu^_diehtaga"}⁋
– should we be removing it from input, or are we somehow introducing \x1f's, or am I not reproducing the issue correctly here? (Some "expected this, but got that" examples would be nice ;))
Sorry I am trying to go on vacation, haha.
The error was "control character (\\u0000-\\u001F) found while parsing a string". It looked like it was coming from libdivvun, but perhaps it isn't. January's problem now.
There was a bug in libdivvun: that character should have been escaped according to the JSON spec. It is fixed now, which will hopefully help with this issue.
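To illustrate the kind of failure involved: JSON does not allow raw control characters (U+0000 to U+001F) inside a string, so they have to be written as escapes such as \u001f. The sketch below uses Python's standard `json` module purely as an illustration. It says nothing about how libdivvun builds its output or how the fix was implemented; the hand-concatenated `naive` payload is an assumption standing in for any serializer that forgets to escape.

```python
import json

text = " dutkama ja luonddu\x1fdiehtaga"

# Hand-built JSON: the raw U+001F lands inside the string literal,
# which a strict parser rejects.
naive = '{"errs":[],"text":"' + text + '"}'

# json.dumps escapes control characters, so the payload stays valid.
proper = json.dumps({"errs": [], "text": text}, ensure_ascii=False)

try:
    json.loads(naive)
    naive_ok = True
except json.JSONDecodeError as err:
    naive_ok = False
    print("naive payload rejected:", err.msg)

# The escaped payload parses and gives back the original text.
roundtrip = json.loads(proper)
print(proper)
```

A strict consumer will fail on the first payload in much the same way as the "control character ... found while parsing a string" error quoted earlier, while the second one round-trips with the \x1f intact.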
This seems to be fixed; at least it is not causing any trouble in either Word or GDocs. Even
konsultašuvdnarievtti
(containing two soft hyphens) seems to be fixed. Closing.
I am not able to get GramDivvun to work in MS Word (the local app) when using the UiT account. It can be installed, it loads, and looks the way it should in the initial screen. But when clicking "Check", it almost immediately dies with the following errors in the console:
Screenshot of the same:
I have tested various setups, and most work, but not this one. The ones I have tested are:
The Safari+UiT problem manifests differently, and thus seems to be a different issue, and can be easily worked around by using another browser. So this bug report targets the Office 2016 locally installed app issue only.