substance / texture

A visual editor for research.
MIT License

Semantic tagging for text #1321

Open JGilbert-eLife opened 5 years ago


Description

The ability to add semantic tags to text in order to designate it as, for example, a gene sequence, an RRID and so on. Currently, gene sequences at eLife are the only live use case.

User stories

Author

  1. As an author, I want to be able to tag a piece of text as a gene sequence so that this information can be picked out for special display, data mining etc.
  2. As an author, I want to be able to see when text has been tagged as a gene sequence so that I can check this has been done correctly.
  3. As an author, I want to be able to remove gene sequence tagging from a piece of text so that I can correct errors.
  4. As an author, I want to be able to add styling (bold, italic etc) to text within a tagged gene sequence so that I can emphasise particular letters.
  5. As an author, I want to be able to edit text tagged as a gene sequence so that I can correct any errors.

But what if . . . ?

Consideration

XML requirements

named-content[@content-type="sequence"] is used for gene sequences/primers:

<p>We used the following primer <named-content content-type="sequence">5´-AGCATCGGACCGGCTTTTTCGAACTGCGGGTGGCTCCAGCTAGCCATGGATCCGCGCCCGATGGTGGGACGGTATG-3´</named-content> ... </p>

On occasion these also include other formatting used for emphasis, such as bold or italic, and this formatting needs to be retained:

<named-content content-type="sequence">cctaggAACATCCCATAAAACATCCCATATTCAGCCGCTAGCAGT<bold>CAGGATTATTTGTACAAGATA</bold>TAGTTATATTCAAGCATA<italic>TATCTTGTACAAATAATCCTG</italic>GCGAATTCAGGCGAGACATCGGAGTTGAAACTAAAACTGAAATTTACTAGAAAACATCCCATAAAACATCCCATATTCAGCCGCTAGCAGT<bold>TCGGAAGAGAGTAGTAACAAA</bold>TAGTTATATTCAAGCATA<bold>TTTGTTACTACTCTCTTCCGA</bold>GCGAATTCAGGCGAGACATCGGAGTTGAAACTAAAACTGAAATTTCCTAGG</named-content>
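To illustrate what a consumer of this markup might do (for example for special display or data mining), here is a minimal sketch in Python using only the standard library. It is not Texture code: the function name `extract_sequences` and the shortened sample fragment are assumptions made for illustration. It collects each sequence as plain text and records the emphasised runs separately, so the formatting is not lost.

```python
import xml.etree.ElementTree as ET


def extract_sequences(xml_fragment):
    """Return one dict per <named-content content-type="sequence"> element.

    Hypothetical helper, not part of Texture or any JATS library.
    """
    root = ET.fromstring(xml_fragment)
    results = []
    for node in root.iter("named-content"):
        if node.get("content-type") != "sequence":
            continue
        # itertext() yields the element's text and the text of nested
        # <bold>/<italic> children in document order.
        plain = "".join(node.itertext())
        # Keep the emphasised runs as (tag, text) pairs.
        emphasis = [
            (child.tag, "".join(child.itertext()))
            for child in node
            if child.tag in ("bold", "italic")
        ]
        results.append({"text": plain, "emphasis": emphasis})
    return results


# Shortened, made-up fragment in the same shape as the examples above.
SAMPLE = (
    '<p>We used <named-content content-type="sequence">'
    "cctagg<bold>CAGG</bold>TAGT<italic>TATC</italic>GCGA"
    "</named-content> here.</p>"
)

sequences = extract_sequences(SAMPLE)
```

With the sample above, `sequences[0]["text"]` is the full sequence without markup, and `sequences[0]["emphasis"]` lists the bold and italic runs in order.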

If this needs to be extended to other concepts, presumably this could be done with different @content-type values on the named-content element, or similarly with @content-type on other elements, such as p.
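As a rough sketch of how the same mechanism could cover other concepts, the Python snippet below wraps a character range of a paragraph in a named-content element with an arbitrary @content-type. Everything here is an assumption for illustration: the helper `tag_span`, the content-type value "rrid", and the placeholder identifier in the sample text are not taken from eLife's XML requirements.

```python
import xml.etree.ElementTree as ET


def tag_span(text, start, end, content_type):
    """Wrap text[start:end] in <named-content content-type="...">.

    Hypothetical helper; returns the serialised <p> element as a string.
    """
    p = ET.Element("p")
    p.text = text[:start]  # text before the tagged span
    tagged = ET.SubElement(p, "named-content", {"content-type": content_type})
    tagged.text = text[start:end]  # the tagged span itself
    tagged.tail = text[end:]  # text after the tagged span
    return ET.tostring(p, encoding="unicode")


# Placeholder identifier, made up for this example.
paragraph = "Antibody RRID:AB_0000000 was used."
term = "RRID:AB_0000000"
start = paragraph.index(term)

tagged_xml = tag_span(paragraph, start, start + len(term), "rrid")
```

Only the @content-type value changes between use cases, so a gene sequence would be tagged with the same call and "sequence" as the last argument.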

Mock ups

Proposal