Open myrix opened 1 month ago
The issue is mostly resolved. Main points:
- In the `OdtMarkupModal` component we now have an internal representation of `parserresult` as JSON. It is generated on the backend and transferred over the network.
- [...] `valency`) use HTML. So in the database we still store our `parserresult` as HTML and convert it to JSON and back on the backend side every time.
- `browserselection` [...] and in order to find the text node index within the whole text.
The current implementation of parser result processing is problematic.
Parser results with disambiguation info are stored as plain-text HTML (see the `content` attribute of the `parserresult` DB table), are displayed in the interface as is, https://github.com/ispras/lingvodoc-react/blob/39b00004b5f94014ad0de095fbfd258fcc64bafa/src/components/OdtMarkupModal/index.js#L505, and are modified by directly taking and saving the interface HTML source as is, https://github.com/ispras/lingvodoc-react/blob/39b00004b5f94014ad0de095fbfd258fcc64bafa/src/components/OdtMarkupModal/index.js#L396

This is obviously unsafe and leads to problems whenever the interface HTML source is modified unintentionally, e.g. when the page is altered by translation extensions or by the browser's built-in translation functionality, which breaks the parser result HTML markup structure.
We need to fix this by storing parser result data in an explicit internal representation format, e.g. as JSON, both on the backend and the frontend, so that the interface explicitly displays, modifies and saves this representation, ensuring its integrity.
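To make the idea concrete, here is a minimal sketch of a backend-side conversion between stored HTML and a JSON representation, using only the Python standard library. The markup it assumes (`<p>` paragraphs containing plain text and `<span class="..." data-variants="...">` word tokens) and the names `html_to_json` and `json_to_html` are hypothetical; the actual parser result markup and field names will differ.

```python
import json
from html import escape
from html.parser import HTMLParser


class ParserResultHTMLToJSON(HTMLParser):
    """Collects paragraphs of plain text and annotated word tokens.

    Hypothetical markup: <p> paragraphs with plain text and
    <span class="verified|unverified" data-variants='[...]'>word</span>.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []
        self._token = None  # annotated word currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "p":
            self.paragraphs.append([])
        elif tag == "span" and self.paragraphs:
            self._token = {
                "type": "word",
                "state": attrs.get("class") or "unverified",
                "variants": json.loads(attrs.get("data-variants") or "[]"),
                "text": "",
            }

    def handle_endtag(self, tag):
        if tag == "span" and self._token is not None:
            self.paragraphs[-1].append(self._token)
            self._token = None

    def handle_data(self, data):
        if not self.paragraphs:
            return  # text outside any paragraph is ignored in this sketch
        if self._token is not None:
            self._token["text"] += data
        else:
            self.paragraphs[-1].append({"type": "text", "text": data})


def html_to_json(html_source):
    """Convert stored HTML into the explicit JSON-style representation."""
    parser = ParserResultHTMLToJSON()
    parser.feed(html_source)
    parser.close()  # flush any buffered trailing text
    return {"paragraphs": parser.paragraphs}


def json_to_html(document):
    """Render the representation back to HTML for storage."""
    parts = []
    for paragraph in document["paragraphs"]:
        parts.append("<p>")
        for token in paragraph:
            if token["type"] == "word":
                variants = json.dumps(token["variants"], ensure_ascii=False)
                parts.append(
                    '<span class="{}" data-variants="{}">{}</span>'.format(
                        escape(token["state"], quote=True),
                        escape(variants, quote=True),
                        escape(token["text"]),
                    )
                )
            else:
                parts.append(escape(token["text"]))
        parts.append("</p>")
    return "".join(parts)


source = (
    '<p>Hello <span class="unverified" '
    'data-variants="[&quot;hello-N&quot;, &quot;hello-V&quot;]">world</span>!</p>'
)
document = html_to_json(source)
```

With a representation like this, the interface would edit `document` (for example, changing a token's `state`) and send that back, instead of saving whatever HTML the browser currently holds, so third-party changes to the page DOM cannot leak into the stored data.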
Naturally, all functionality which uses parser results as source data, in particular valency example extraction, should be updated accordingly. Also, it might be beneficial to store parser results not as whole big JSON documents, but separately by paragraphs, or even by paragraphs and sentences, to simplify processing and editing, in particular to minimize data exchange between frontend and backend when saving disambiguation updates. That would require more extensive modifications to the `parserresult` DB table (and perhaps the introduction of additional helper tables) and to the source code of the corresponding functionality, so it should be carefully considered before deciding whether to go for it or not.

It may very well be that, to a certain extent, work on this issue would be better done concurrently with other current issues pertaining to the handling of parser results and their derivatives.
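The per-paragraph storage idea mentioned above could reduce a save to just the paragraphs that changed. A minimal sketch, assuming parser results are held as a list of JSON-serializable paragraphs and that disambiguation edits never add or remove paragraphs; the function names `changed_paragraphs` and `apply_update` and the token fields are hypothetical:

```python
import copy


def changed_paragraphs(stored, edited):
    """Return {index: paragraph} for paragraphs that differ from the stored ones.

    Assumption: a disambiguation edit only changes paragraph contents,
    never the number or order of paragraphs.
    """
    if len(stored) != len(edited):
        raise ValueError("paragraph count changed; a full save is needed")
    return {
        index: new
        for index, (old, new) in enumerate(zip(stored, edited))
        if old != new
    }


def apply_update(stored, update):
    """Apply a {index: paragraph} update and return a new paragraph list."""
    result = list(stored)
    for index, paragraph in update.items():
        if not 0 <= index < len(result):
            raise IndexError("no stored paragraph with index {}".format(index))
        result[index] = paragraph
    return result


# Frontend side: the user verifies one word in the first paragraph.
stored = [
    [{"type": "word", "text": "world", "state": "unverified", "variants": []}],
    [{"type": "text", "text": "Unchanged paragraph."}],
]
edited = copy.deepcopy(stored)
edited[0][0]["state"] = "verified"

update = changed_paragraphs(stored, edited)  # only paragraph 0 is sent
saved = apply_update(stored, update)         # backend side
```

Whether this is worth the extra DB rows and bookkeeping depends on how large typical parser results are compared with a single edit, which is the trade-off to weigh.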