oaregithub / oare_mono


Represent dashes between signs that are superfluous/erased within a word #992

Closed by edstratford 2 years ago

edstratford commented 2 years ago

The front end places dashes (or periods) between syllabic signs that belong to the same word. Occasionally an ancient writer included an extra sign. In English, this would be like 'He wen(n)t to the store.' The original writer spelled 'went' with two n's, and an editor of a modern edition places the second 'n' in parentheses to show the reader that it was there, but that we shouldn't look for a separate word 'wennt' somewhere in the dictionary.

In Akkadian, we designate such signs with doubled pointy brackets, for example, TÚG«kà»-kam-sú-um.

Sometimes the ancient writer realized it was a mistake and 'erased' the sign by rubbing it out (smushing the clay). In this case we use curly brackets, for example iš-{qú}-qul for iš-qul, where the writer accidentally wrote the qú sign and then smushed it out.

Each such sign is recorded in the database as a row in the text_markup table, with type 'superfluous' or 'erasure'.
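To make the bracket conventions concrete, here is a minimal TypeScript sketch of how a renderer might wrap a sign reading according to its markup type. The function name `wrapReading` and the `MarkupType` alias are hypothetical and not taken from the oare_mono code. Only the two type strings and the bracket characters come from the description above.

```typescript
// The two markup type strings named in this issue; the alias itself is hypothetical.
type MarkupType = 'superfluous' | 'erasure';

// Wrap a sign reading in the editorial brackets described above.
// Signs without one of these markup types are returned unchanged.
function wrapReading(reading: string, markup?: MarkupType): string {
  switch (markup) {
    case 'superfluous':
      return `«${reading}»`; // extra sign left in place by the ancient writer
    case 'erasure':
      return `{${reading}}`; // sign the writer rubbed out in the clay
    default:
      return reading;
  }
}

// Usage: the signs from the examples above.
console.log(wrapReading('kà', 'superfluous'));
console.log(wrapReading('qú', 'erasure'));
```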

The current renderer does not handle such cases well. For example, see the second line here: [image]

The renderer needs to be reconfigured to place the appropriate connective around these signs: a dash, a period, or nothing (between a determinative and an adjacent sign).

So the example above should display as šu-ru-«tù»-tum. Signs marked as superfluous, like «tù», will not (and should not) have a discourse_uuid, which is why the renderer currently does not put dashes around them. But when such a sign occurs within a series of rows that share the same discourse_uuid, the epigraphy view should still display it as part of that series, with connectives on both sides.

It may be that the renderer stops using connective markings as soon as it encounters a sign whose discourse_uuid differs from the previous one. If so, it needs to look further ahead (?): if the same discourse_uuid recurs, it should treat all of those rows as if they had the same discourse_uuid. Alternatively, the renderer could suspend the "discourse_uuid is null" logic whenever the sign also has text_markup.type = 'superfluous' or 'erasure'.
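The look-ahead idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual oare_mono renderer: the `SignRow` shape, the function names, and the choice of a plain dash as the only connective are all hypothetical, and determinatives and period separators are left out. A marked sign with no discourse_uuid borrows the uuid of its neighbours only when the nearest ordinary signs on both sides belong to the same word.

```typescript
type MarkupType = 'superfluous' | 'erasure';

// Hypothetical row shape: one sign on a line of the epigraphy view.
interface SignRow {
  reading: string;
  discourseUuid: string | null;
  markup?: MarkupType;
}

function bracket(row: SignRow): string {
  if (row.markup === 'superfluous') return `«${row.reading}»`;
  if (row.markup === 'erasure') return `{${row.reading}}`;
  return row.reading;
}

// A superfluous/erased sign that (correctly) has no discourse_uuid.
function isOrphanMarked(row: SignRow): boolean {
  return row.discourseUuid === null && row.markup !== undefined;
}

// For each sign, the uuid the renderer should act on. An orphan marked sign
// inherits a uuid only if the nearest ordinary signs before and after it
// (skipping other orphan marked signs) share the same non-null uuid.
function effectiveUuids(rows: SignRow[]): (string | null)[] {
  return rows.map((row, i) => {
    if (!isOrphanMarked(row)) return row.discourseUuid;
    let left = i - 1;
    while (left >= 0 && isOrphanMarked(rows[left])) left--;
    let right = i + 1;
    while (right < rows.length && isOrphanMarked(rows[right])) right++;
    const before = left >= 0 ? rows[left].discourseUuid : null;
    const after = right < rows.length ? rows[right].discourseUuid : null;
    return before !== null && before === after ? before : null;
  });
}

// Join signs with '-' inside a word and ' ' between words.
function renderLine(rows: SignRow[]): string {
  const uuids = effectiveUuids(rows);
  let out = '';
  rows.forEach((row, i) => {
    if (i > 0) {
      const sameWord = uuids[i] !== null && uuids[i] === uuids[i - 1];
      out += sameWord ? '-' : ' ';
    }
    out += bracket(row);
  });
  return out;
}

// Usage: the example from this issue, with a made-up uuid 'w1'.
const line: SignRow[] = [
  { reading: 'šu', discourseUuid: 'w1' },
  { reading: 'ru', discourseUuid: 'w1' },
  { reading: 'tù', discourseUuid: null, markup: 'superfluous' },
  { reading: 'tum', discourseUuid: 'w1' },
];
console.log(renderLine(line));
```

In this sketch a marked sign at the very start or end of a word, or between two different words, is not attached to anything. Whether it should be is a design decision this issue does not settle.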

There are currently 981 signs marked as 'superfluous' in text_markup, and 224 as 'erasure'.

hbludworth commented 2 years ago

I believe that when texts are added, these signs are currently assigned the discourse_uuid of the word. This apparently should not be happening. When updating the renderer, I need to make this adjustment as well.

Because of this, things get weird at the discourse connection step. Superfluous and erasure items are included in the word, so there are usually no possible discourse connections. These signs should be ignored when connecting words.
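A rough sketch of that filtering, assuming a hypothetical `SignRow` shape and a hypothetical helper `spellingForConnection` (neither is taken from the oare_mono code): superfluous and erased signs are dropped before the word's spelling is assembled for matching, so iš-{qú}-qul is looked up as iš-qul.

```typescript
type MarkupType = 'superfluous' | 'erasure';

// Hypothetical row shape: one sign of a word being added.
interface SignRow {
  reading: string;
  markup?: MarkupType;
}

// Build the spelling used to look up discourse connections,
// ignoring signs marked as superfluous or erased.
function spellingForConnection(rows: SignRow[]): string {
  return rows
    .filter((row) => row.markup === undefined)
    .map((row) => row.reading)
    .join('-');
}

// Usage: the erasure example from this issue.
const word: SignRow[] = [
  { reading: 'iš' },
  { reading: 'qú', markup: 'erasure' },
  { reading: 'qul' },
];
console.log(spellingForConnection(word));
```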

edstratford commented 2 years ago

Correct - those rows do not get a discourse_uuid. And yeah, this is just a renderer thing. Thanks!