CharterMap / chartermap

GNU General Public License v3.0

Overlapping metadata editing #2

Open CharterMap opened 4 months ago

CharterMap commented 4 months ago

It needs to be possible to mark up a document with various overlapping attributes. An attribute can cover anything from a single character to the entire document. Attributes will overlap, and they will overlap in messy ways.

For instance, one attribute might state that a section is in Latin, another might link a personal name in the document to an instance of a Person object, and another might link to an instance of a Diplomatic Phrase object.

This will probably be a core requirement for marking up the charters, both at the start of the project and in the long term.

CharterMap commented 4 months ago

This might be possible by having a single "base" layer, with every Object type having its own layer on top of that (e.g. a layer for language, a layer for person, a layer for script, for scribe, place, and so on). The fundamental concept is that no two tags within a single layer should have any possibility of overlapping at any point.
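A minimal sketch of the layer idea, assuming each tag is stored as a half-open range of character indices into the base text. The names `Tag`, `Layer`, and `add` are hypothetical and only illustrate the "no overlap within a layer" rule; tags in different layers are free to overlap.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tag:
    start: int   # index of the first covered character in the base text
    end: int     # index one past the last covered character
    value: str   # e.g. a language code, or an identifier for a Person object


@dataclass
class Layer:
    name: str
    tags: list = field(default_factory=list)

    def add(self, tag: Tag) -> None:
        """Add a tag, refusing it if it overlaps another tag in this layer."""
        if tag.start >= tag.end:
            raise ValueError("a tag must cover at least one character")
        for other in self.tags:
            # Two half-open ranges overlap when each starts before the other ends.
            if tag.start < other.end and other.start < tag.end:
                raise ValueError(f"overlaps {other} in layer {self.name!r}")
        self.tags.append(tag)


base = "Ego Willelmus rex Anglorum"   # made-up sample text

language = Layer("language")
person = Layer("person")

# These two tags overlap each other, which is fine: they live in different layers.
language.add(Tag(0, len(base), "la"))
person.add(Tag(4, 13, "person:willelmus"))
```

Adding a second tag such as `Tag(10, 15, ...)` to the `person` layer would raise `ValueError`, because it overlaps the existing range 4 to 13.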

How to achieve this, practically? The biggest difficulty I see is in recording where a tag starts and stops in the text:

  1. We can simply say that a tag starts and ends at two numbers, where the numbers are indices into the array of characters that forms the "base", untagged, text layer. This has the obvious flaw that any change to the document that alters the number of characters (including invisible and non-printing characters that are just there for formatting and would go unnoticed..!) would mess up the positioning of all tags.

CharterMap commented 4 months ago

If we did position the tags around text based on start and end places in the list of characters, we could mitigate issues arising from changing the base text like this:

  1. When deleting a character, change the start and end values of every tag after the deleted character by -1
  2. When adding a character, change them by +1
  3. When adding a non-printing character, do nothing!
  4. When changing the number of characters between the beginning and end of a tag, change the value of that tag's end position correspondingly
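
The adjustments above could be sketched roughly as follows, assuming tags are half-open `[start, end)` index ranges. `Tag`, `insert_char`, and `delete_char` are hypothetical names. The sketch treats every character alike; rule 3 would need offsets that count only printing characters, which is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    start: int   # index of the first covered character
    end: int     # index one past the last covered character
    label: str


def insert_char(text, tags, pos, ch):
    """Insert ch at index pos and shift tag offsets to match."""
    for t in tags:
        if t.start >= pos:
            t.start += 1   # the whole tag lies at or after the insertion point
            t.end += 1
        elif t.end > pos:
            t.end += 1     # insertion inside the tag: only the end moves
    return text[:pos] + ch + text[pos:]


def delete_char(text, tags, pos):
    """Delete the character at index pos and shift tag offsets to match."""
    for t in tags:
        if t.start > pos:
            t.start -= 1   # the whole tag lies after the deleted character
            t.end -= 1
        elif t.end > pos:
            t.end -= 1     # deleted character was inside the tag
    return text[:pos] + text[pos + 1:]


text = "Ego Willelmus rex"   # made-up sample text
tags = [Tag(4, 13, "person"), Tag(14, 17, "title")]

text = delete_char(text, tags, 0)       # remove the leading "E"
text = insert_char(text, tags, 4, "h")  # insert inside the person tag

for t in tags:
    print(t.label, repr(text[t.start:t.end]))
```

Two choices here are judgment calls rather than anything settled in this thread: a character inserted exactly at a tag's start is treated as falling before the tag, and deleting every character inside a tag leaves an empty tag (`start == end`) that would still need cleaning up.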