What is the correct way to "break down" an annotated post, so that annotations are preserved?

albertocottica commented 2 years ago

In POPREBEL there is debate on when it is appropriate to keep the transcript of an interview as a whole, unbroken, long post, as opposed to breaking it into several posts in the same topic, each containing a question and an answer.

This has led some ethnographers to want to break down an already annotated post into several smaller posts. Is there a way to do it so that annotations are preserved? Which one would that be?

tanius commented 2 years ago

We had this question come up once before, and back then the "solution" was that to split annotated Discourse content, the split-off parts would have to be re-annotated manually. If this is a one-time request about a limited number of interviews (say, ≤30 long interviews), then that would still be the most efficient solution by far.

Only if this is a recurring requirement could we think of software features to make this task more efficient. It would be a week's worth of programming, at least. Such software support might consist of a way to copy a post incl. its annotations to a different topic, and then a way to delete content from a post (or, more generally, to edit a post) without affecting how annotations are anchored to it. That latter feature would be desirable anyway, because right now annotations can become mangled when a user edits their own post after it has been annotated. (The simpler solution for that, which we have been planning for so far, is to tie annotations to a certain version in the edit history of a post.)
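To illustrate the version-pinning idea in the parenthesis above, here is a minimal sketch. It is not the plugin's actual code: the `Post` and `PinnedAnnotation` classes are hypothetical, and plain character offsets stand in for the real XPath pointers. The point is only that an annotation which remembers the version it was made on keeps resolving against that version, whatever happens in later edits.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    # Hypothetical model: every edit appends a revision, old ones are kept.
    revisions: list[str] = field(default_factory=list)

    def edit(self, new_text: str) -> int:
        """Store a new revision and return its 1-based version number."""
        self.revisions.append(new_text)
        return len(self.revisions)


@dataclass(frozen=True)
class PinnedAnnotation:
    version: int  # the post version the offsets refer to
    start: int    # character offsets (assumption: stand-in for XPath pointers)
    end: int

    def text(self, post: Post) -> str:
        """Resolve the annotation against the version it was created on."""
        return post.revisions[self.version - 1][self.start:self.end]


post = Post()
v1 = post.edit("Trust matters here.")
annotation = PinnedAnnotation(version=v1, start=0, end=5)
post.edit("Honestly, trust matters here.")  # a later edit by the author
print(annotation.text(post))  # still resolved against version 1
```

The trade-off is that the annotation stays attached to the old version, so it does not follow the text into the edited post or into a split-off post.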

The reason such features are so complicated is that annotations are stored outside the content of a post, referring to positions in a post with XPath pointers, which are formal expressions that say things like "the starting position is in the second paragraph, in there the first 'bold' text part, and in there the 23rd character". Now when post content is edited, the XPath expressions have to be adapted, and depending on the nature of the edits the software may be unable to determine how to adapt them. To be as good at that task as possible, it would have to try different heuristics one after the other, the last of which would be to look for the annotated text (which we store in addition to the XPath) and re-anchor the XPath expressions to it if it occurs only once in the edited text. Even that last heuristic may lead to a wrong solution under specific circumstances. There is just no way of doing this perfectly right, which is a problem that programmers hate …
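A rough sketch of that last-resort heuristic, to make the failure modes concrete. This is an illustration under simplifying assumptions, not the plugin's implementation: the `Annotation` class and `reanchor` function are hypothetical, and a plain character offset replaces the XPath pointer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    # Hypothetical, simplified record: a character offset stands in
    # for the XPath pointer used by the real annotations.
    start: int
    quote: str  # the annotated text, stored alongside the position


def reanchor(annotation: Annotation, edited_text: str) -> Optional[Annotation]:
    """Return a re-anchored annotation, or None if no safe anchor exists."""
    quote = annotation.quote
    # First heuristic: the quote is still at the stored position.
    if edited_text.startswith(quote, annotation.start):
        return annotation
    # Last heuristic: search for the quote and accept it only if unique.
    first = edited_text.find(quote)
    if first == -1:
        return None  # the annotated text was deleted or changed
    if edited_text.find(quote, first + 1) != -1:
        return None  # the quote occurs more than once: ambiguous
    return Annotation(start=first, quote=quote)


original = "Q: Why? A: Because of trust."
annotation = Annotation(start=original.index("trust"), quote="trust")

moved = reanchor(annotation, "A: Because of trust.")
ambiguous = reanchor(annotation, "A: trust builds trust.")
deleted = reanchor(annotation, "A: No comment.")
```

Even the "unique occurrence" case can be wrong: if the annotated passage was deleted and the same words happen to appear once elsewhere, the annotation silently moves to the wrong place.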

albertocottica commented 2 years ago

Noted with thanks.

damingo commented 2 years ago

"The simpler solution for that, which we have been planning for so far, is to tie annotations to a certain version in the edit history of a post."

By now this has been implemented. Does it resolve this issue so we can close it? @albertocottica

edgeryders / discourse-annotator

What is the correct way to "break down" an annotated post, so that annotations are preserved? #228