cwrc / CWRC-WriterBase

The base class from which to create a CWRC-Writer XML editor.
GNU General Public License v2.0
14 stars 3 forks source link

NERVE-generated entities, when selected in the NERVE panel are showing an ofset string selected in the text area #234

Open ilovan opened 4 years ago

ilovan commented 4 years ago

The offset is inherited by the entity once accepted. See for example Lisbon in: https://dev-cwrc-writer.cwrc.ca/?githubPath=TestGitWriterSchemaCSSchange&githubRepo=ilovan%2FGit-Writer-tests

The offset seems to be bigger the longer the document is: See : https://dev-cwrc-writer.cwrc.ca/?githubPath=Salomania&githubRepo=ilovan%2FNERVE-integration-tests

image.png

Discovered while testing https://github.com/cwrc/CWRC-WriterBase/issues/171

ajmacdonald commented 4 years ago

This issue seems to be caused by NERVE not handling unicode characters properly. Trying running NERVE on this document for an example https://dev-cwrc-writer.cwrc.ca/?githubPath=Salomania&githubRepo=ajmacdonald%2Fgitwriter In the NERVE response “Viral” gets converted into ¬タワViral¬タン. This is what's causing the offset issue.

ilovan commented 4 years ago

That's unfortunate and it seems like a newly introduced problem, though nobody has been changing NERVE recently, to my knowledge. Any recommendations about how to proceed?

ajmacdonald commented 4 years ago

We recently introduced a proxy in GitServer for NERVE but I've determined that that's not the cause, as the issue still occurs when trying directly as outlined here: https://github.com/cwrc/NERVE#testing-api-via-curl

Maybe Jeff might have some insight on possible solutions.

ilovan commented 4 years ago

OK, I'll consult with him next week when he is back with CWRC. Tentatively reassigning the ticket to him.

jefferya commented 4 years ago

@ilovan @ajmacdonald Is there a usecase for a charset other than UTF-8? I've what I think is a quick fix for the UTF-8 case. To support other charsets, a bigger refactor is required. See a number of Todos

jefferya commented 4 years ago

@ilovan @ajmacdonald The DockerHub dev instance is updated. Is there a way to test the image: cwrc/nerve:latest-dev?

lucaju commented 4 years ago

I've just tested on the dev server. No sign of offset in the document mentioned. I couldn't reproduce the bug.

ilovan commented 4 years ago

correct - no more offset issue (tested here: https://github.com/cwrc/CWRC-WriterBase/issues/234#issuecomment-579362637, which in the past was impossible to NER because of the offset)