melmothx / text-amuse

Text::Amuse parser
http://www.amusewiki.org

annotation-style notes #61

Open bibliotechie opened 4 years ago

bibliotechie commented 4 years ago

We would like to have support for notes that operate as annotations. This would be similar to a footnote, but instead of operating on a singular point in the text, it would be associated with a range of text. (How exactly this would be rendered is, for now, not super important. The gist though would be some highlighted text, with the note either in the margin, or linked to at the bottom of the page/text, like footnotes are now.)

Some background

Myself and @mycelian (so far) are working on an idea for a site that integrates a discussion platform into Amusewiki. The broad idea would be something similar to margin notes in a book passed around a group of friends, built around the web annotation standard using client code from Hypothesis. Since this is obviously a huge diversion from the current use-case for Amusewiki, we're developing it as a fork, but our hope is to keep things tracked as closely with upstream as possible, and depending on how well it works, maybe in the future merge the project upstream, behind a feature flag.

The current problem

I'll skip the details of how we plan on integrating the annotation platform overall, but what matters here is that we'd like to stick with the spirit of Amuse, in that we want the annotations to be represented within the markup, so that the annotations can be compiled into LaTeX from the muse file, are tracked via the git backend on Amusewiki, etc. Ideally, this would mean getting the syntax for representing these annotations implemented here (at some point adding them to your compiler would be nice too, but that's less important). If, for some reason, that's not possible, the next best thing would be to at least have a syntax design for our fork that interacts with upstream as nicely as possible, so input would still be appreciated.

Proposed solution

As I said at the beginning, the solution we have in mind treats annotations as akin to footnotes. Unlike the two existing footnote styles, the design needs to associate a text range with the note body. The solution we've come up with is a mixture of footnote and link syntax:

First sentence. {{Second sentence.}{id}} Third sentence.

{{id}} Insightful comment.

The above is intended to resemble footnotes, but it could be made to resemble links more by swapping the parameters, {{id}{Second sentence.}}. One point that does involve our integration is that it would be nice, but not necessary, if the id could be a string, rather than an integer (it would make integrating with other web annotation implementations a little easier).
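To make the proposal concrete, here is a minimal Python sketch of how the `{{text}{id}}` reference and the `{{id}} body` definition could be paired up. It is only an illustration under assumptions: the regular expressions and the `parse_annotations` helper are hypothetical, nothing here is part of Text::Amuse, and it assumes a range never crosses a paragraph or contains braces.

```python
import re

# Hypothetical patterns for the proposed syntax; not part of Text::Amuse.
RANGE_RE = re.compile(r"\{\{([^{}]+)\}\{([^{}\s]+)\}\}")  # {{annotated text}{id}}
NOTE_RE = re.compile(r"^\{\{([^{}\s]+)\}\}\s+(.*)$")      # {{id}} note body


def parse_annotations(source):
    """Return (plain_text, ranges, notes) for a small muse-like string.

    ranges maps id -> (start, end) offsets into plain_text;
    notes maps id -> note body. String ids work as well as integers.
    """
    notes = {}
    body_lines = []
    for line in source.splitlines():
        match = NOTE_RE.match(line)
        if match:
            # A line of the form "{{id}} body" defines a note.
            notes[match.group(1)] = match.group(2)
        else:
            body_lines.append(line)
    body = "\n".join(body_lines).strip()

    ranges = {}
    plain = []
    length = 0  # length of the plain text emitted so far
    pos = 0     # read position in the marked-up body
    for match in RANGE_RE.finditer(body):
        before = body[pos:match.start()]
        plain.append(before)
        length += len(before)
        text, ann_id = match.group(1), match.group(2)
        ranges[ann_id] = (length, length + len(text))
        plain.append(text)
        length += len(text)
        pos = match.end()
    plain.append(body[pos:])
    return "".join(plain), ranges, notes
```

Fed the example above, this yields the plain sentence, a range covering "Second sentence.", and a note body keyed by the same id, which is the association a renderer (margin note, footnote, highlight) would need.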

Obviously there's a lot to tweak in the design, and other possible designs work as well. Let us know what you think, or if there's anything we should clarify.

melmothx commented 4 years ago

Hi,

FWIW I looked into Hypothesis some years ago, I didn't follow up on that, but I understand what you're talking about.

Of course if you keep the fork close, if this thing gets implemented I will be glad to merge it back. Please complain if you think I'm disrupting something and I'm sure we can find a solution.

A note and a couple of questions.

The secondary {1} style footnote proved to be kind of buggy on the LaTeX side, with footnotes landing on the wrong page. Just so you know.

When I get the chance, I'm going to see whether something can be done about that.

Now, looking at your example, it opens up a couple of questions.

Is it guaranteed that the referenced text always falls inside a paragraph? Would it be legitimate for it to span several paragraphs? Because that would cause a lot of implementation headaches:

First {{sentence which needs annotation.

Another paragraph}{id}}. And we continue.

Second, I'm not really sure why you need the id at all. Couldn't it be just this:

{{sentence...}{annotation...
eventually with more paragraphs}}

And expanding to a margin note or footnote or whatever?

I'm still mulling over this whole thing, though; these are just my first concerns.

Please let me know.

Thanks

-- Marco

bibliotechie commented 4 years ago

Hi, FWIW I looked into Hypothesis some years ago, I didn't follow up on that, but I understand what you're talking about. Of course if you keep the fork close, if this thing gets implemented I will be glad to merge it back. Please complain if you think I'm disrupting something and I'm sure we can find a solution.

Great to hear, thanks. :)

Is it guaranteed that the referenced text always falls inside a paragraph? Would it be legitimate for it to span several paragraphs? Because that would cause a lot of implementation headaches: First {{sentence which needs annotation. Another paragraph}{id}}. And we continue.

I suppose it should be possible to span across paragraphs, but that to me seems like such a rare use I don't think it's a deal-breaker. I think it would be reasonable to just document it as a technical limitation.

Second, I'm not really sure why you need the id at all. Couldn't it be just this: {{sentence...}{annotation...
eventually with more paragraphs}} And expanding to a margin note or footnote or whatever?

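Yes, that's one of the designs we looked at. We opted for the ID for two reasons:

- It can make the markup a bit more readable; e.g., if there's a single word in a sentence annotated, it's easier to read the sentence with a "highlighted" word and find the annotation elsewhere, than having the sentence interrupted by another sentence.
- For the purposes of implementing faster search, RSS, etc., the annotations will also be stored or referenced in a database. Having a unique identifier (either globally, or just in that document) to reference makes mapping between the two simpler/less error prone, at least in theory.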

If you prefer fully inline though, I'm sure we can find a way to work with it.

melmothx commented 4 years ago

bibliotechie notifications@github.com writes:

I suppose it should be possible to span across paragraphs, but that to me seems like such a rare use I don't think it's a deal-breaker. I think it would be reasonable to just document it as a technical limitation.

Second, I'm not really sure why you need the id at all. Couldn't it be just this: {{sentence...}{annotation...
eventually with more paragraphs}} And expanding to a margin note or footnote or whatever?

Yes, that's one of the designs we looked at. We opted for the ID for two reasons:

- It can make the markup a bit more readable; e.g., if there's a single word in a sentence annotated, it's easier to read the sentence with a "highlighted" word and find the annotation elsewhere, than having the sentence interrupted by another sentence.
- For the purposes of implementing faster search, RSS, etc., the annotations will also be stored or referenced in a database. Having a unique identifier (either globally, or just in that document) to reference makes mapping between the two simpler/less error prone, at least in theory.

If you prefer fully inline though, I'm sure we can find a way to work with it.

I'm still in the mulling-over phase (I hope you're not in a hurry for this), but what happens if you have two annotations for the same sentence? I don't think it's going to be so rare.

So I'm throwing this idea at you:

Sentence {{id}}this part is {{id2}}important{{/id}}, and this{{/id2}} is also.

Which would expand to this:

Sentence this part is important\annotation{...}, and this\annotation{...} is also.

and in HTML

Sentence <span class="start-annotation-1"></span>this part is <span class="start-annotation-2"></span>important<span class="end-annotation-1"></span>, and this<span class="end-annotation-2"></span> is also.

Something like that.

It gives some meaning to the ID, and you could actually have overlapping annotations.
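A rough Python sketch of why paired markers allow overlap: each `{{id}}` / `{{/id}}` becomes an empty milestone element instead of a wrapping one, so ranges never have to nest. The function names and the `start-annotation-ID` / `end-annotation-ID` class naming are assumptions made for illustration, not an existing Text::Amuse feature.

```python
import re

# Hypothetical marker syntax: {{id}} opens a range, {{/id}} closes it.
MARKER_RE = re.compile(r"\{\{(/?)([A-Za-z0-9_-]+)\}\}")


def to_html_milestones(text):
    """Replace each marker with an empty <span>, so ranges may overlap."""
    def replace(match):
        kind = "end" if match.group(1) else "start"
        return '<span class="{}-annotation-{}"></span>'.format(kind, match.group(2))
    return MARKER_RE.sub(replace, text)


def annotation_ranges(text):
    """Return (plain_text, {id: (start, end)}) with offsets into plain_text."""
    plain = []
    length = 0   # length of the plain text emitted so far
    pos = 0      # read position in the marked-up text
    starts = {}  # ids that are open but not yet closed
    ranges = {}
    for match in MARKER_RE.finditer(text):
        chunk = text[pos:match.start()]
        plain.append(chunk)
        length += len(chunk)
        pos = match.end()
        closing, ann_id = match.group(1), match.group(2)
        if closing:
            if ann_id not in starts:
                raise ValueError("closing marker without opener: " + ann_id)
            ranges[ann_id] = (starts.pop(ann_id), length)
        else:
            starts[ann_id] = length
    if starts:
        raise ValueError("unclosed annotation(s): " + ", ".join(sorted(starts)))
    plain.append(text[pos:])
    return "".join(plain), ranges
```

For the sentence above, `annotation_ranges` reports one range covering "this part is important" and a second, overlapping one covering "important, and this"; a LaTeX backend could use the same offsets to decide where each note command goes.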

Finally, in the application I think you can do the highlighting with some JS. I don't know how, but I think it should be possible as long as enough information is put in the markup (famous last words).

-- Marco

bibliotechie commented 4 years ago

I'm still in the mulling-over phase (I hope you're not in a hurry for this),

No worries, we're not currently blocked on this.

but what happens if you have two annotations for the same sentence? I don't think it's going to be so rare.

Ah, yes the syntax I gave doesn't support overlapping annotations, good catch. We played with a syntax built around XML-style tags that would have addressed this, but ended up scrapping it because it didn't look very muse-like. Your design sounds good to me though.

Finally, in the application I think you can do the highlighting with some JS. I don't know how, but I think it should be possible as long as enough information is put in the markup (famous last words).

The client will take care of this, yes. We likely won't be using the Muse or HTML representation for this though, because that's not how the upstream client works, and our goal is to keep modifications to the client as minimal as possible (not because we intend to upstream our changes in this case, but because it's a project with a lot of churn, and complicated patch sets would be a lot of work to maintain). Instead, our plan is to implement the parts of the Hypothesis API we want in Amusewiki, and point the client at it. The API returns information about the annotation that the client uses to figure out where in the page to apply the highlight. This is one of the reasons for representing the annotations in a database. If you want to implement annotations before we finish the client, then the <span>s might be necessary, but once the client is fully up, we could probably just ignore annotation markup for HTML output. (The client technically works with PDFs as well, but our current plan is to compile the annotations into the LaTeX so that they can be opened with other software, printed, etc.)
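For readers unfamiliar with how a client can locate a highlight without any markup in the page, here is a small Python sketch that builds an annotation in the shape of the W3C Web Annotation Data Model, using a `TextQuoteSelector` (the quoted text plus a little surrounding context). The `make_annotation` helper, the `context` parameter and the example URL are assumptions made for illustration; the exact payload the Hypothesis API returns may be structured differently.

```python
def make_annotation(plain_text, start, end, note, source_url, context=20):
    """Build a Web Annotation style dict for plain_text[start:end].

    The selector stores the exact quote plus some text before and after it,
    which lets a client re-find the range in the rendered page.
    """
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": note},
        "target": {
            "source": source_url,
            "selector": {
                "type": "TextQuoteSelector",
                "exact": plain_text[start:end],
                "prefix": plain_text[max(0, start - context):start],
                "suffix": plain_text[end:end + context],
            },
        },
    }
```

Calling `make_annotation("First sentence. Second sentence. Third sentence.", 16, 32, "Insightful comment.", "https://example.org/library/some-text")` produces a record whose selector quotes "Second sentence."; a record like this is the kind of thing that would live in the database alongside the id kept in the muse file.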

melmothx commented 4 years ago

@bibliotechie I've been mulling this over for some days now, and I'm inclined toward a no-go. I do see the value of annotations, but the secondary footnotes already exist for that purpose (and, as previously noted, they have some unwanted side effects which must be resolved before adding what is basically a third level).

Especially if you're not going to use the HTML markup for that, I don't see much point in adding a fairly large change to the Muse markup itself, as the PDF/EPUB would just show a note.

Honestly, I'm a bit worried about adding a feature to the parser when it's unclear how much it will be used. As you can imagine, people ask for features, then leave, but the feature, and the code to maintain, stays.

If you can make it work with the existing set of features (secondary footnotes), it would be great. Otherwise I think we can circle back when your project is more fleshed out, so I can take a look at the implementation and come back with a more shared and reasoned proposal than the current one. Please do not underestimate the value of seeing the whole picture.

I know this is not going to sound too good to you, but I hope you'll understand. I'm not totally opposed to the idea, I just think it's too early.