FrankensteinVariorum / fv-postCollation

a repository for post-processing finalized collation files to prepare the Variorum edition.

Script to wed Hypothesis annotations with TEI #3

Closed mdlincoln closed 5 years ago

mdlincoln commented 5 years ago

Once we resolve #2, determine which method to use to transform the Hypothesis JSON into TEI (and who will be responsible for it).

mdlincoln commented 5 years ago

Depends on #1

mdlincoln commented 5 years ago

@raffazizzi @ebeshero do take a look at https://github.com/PghFrankenstein/fv-data/blob/master/hypothesis/sample_hypothesis.json and think about what fields form the metadata you're going to be interested in. If it would be helpful to transform this into a specific format for you, then let me know and I can reshape it - on the other hand, if you're happy enough working with this JSON file as is, then go for it.

mdlincoln commented 5 years ago

Also a prettily-formatted sample of one record: https://github.com/PghFrankenstein/fv-data/blob/master/hypothesis/pretty_hypothesis.json

{
  "updated": "2017-12-30T17:10:55.246721+00:00",
  "group": "GwWrAWaw",
  "target": [
    {
      "source": "https://ebeshero.github.io/Pittsburgh_Frankenstein/Frankenstein_1818.html",
      "selector": [
        {
          "endContainer": "/p[9]",
          "startContainer": "/p[9]",
          "type": "RangeSelector",
          "startOffset": 460,
          "endOffset": 552
        },
        {
          "type": "TextPositionSelector",
          "end": 12504,
          "start": 12412
        },
        {
          "exact": "voyages which have been made in the prospect of arriving at the North Pacific\n         Ocean",
          "prefix": "ccounts of the\n         various ",
          "type": "TextQuoteSelector",
          "suffix": "\n         through the seas which"
        }
      ]
    }
  ],
  "links": {
    "json": "https://hypothes.is/api/annotations/zzMgaj_qEeeDXGtNGARk5Q",
    "html": "https://hypothes.is/a/zzMgaj_qEeeDXGtNGARk5Q",
    "incontext": "https://hyp.is/zzMgaj_qEeeDXGtNGARk5Q/ebeshero.github.io/Pittsburgh_Frankenstein/Frankenstein_1818.html"
  },
  "tags": [
    "voyages",
    "Northwest Passage",
    "geography"
  ],
  "text": "Both commercial and scientific voyages have been searching for a Northwest passage or open seaway between the Atlantic and Pacific oceans. For the Arctic context of the novel, see Adriana Craciun, \"Writing the Disaster: Franklin and *Frankenstein*,\" *Nineteenth-Century Literature,* 65.4 (2011): 433-80.",
  "created": "2017-05-23T19:05:38.879749+00:00",
  "uri": "https://ebeshero.github.io/Pittsburgh_Frankenstein/Frankenstein_1818.html",
  "flagged": false,
  "user_info": {
    "display_name": null
  },
  "user": "acct:jonklancher@hypothes.is",
  "hidden": false,
  "document": {
    "title": [
      "Frankenstein (1818)"
    ]
  },
  "id": "zzMgaj_qEeeDXGtNGARk5Q",
  "permissions": {
    "read": [
      "group:GwWrAWaw"
    ],
    "admin": [
      "acct:jonklancher@hypothes.is"
    ],
    "update": [
      "acct:jonklancher@hypothes.is"
    ],
    "delete": [
      "acct:jonklancher@hypothes.is"
    ]
  }
}
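
A minimal sketch of how a record like the one above could be reduced to the fields most likely to matter for the TEI, assuming Python with only the standard library. The function name `summarize_annotation` and the selection of fields are illustrative assumptions, not something the project has settled on.

```python
def summarize_annotation(record: dict) -> dict:
    """Reduce one Hypothesis annotation record to a flat summary.

    Only keys that appear in the sample record are read; anything
    missing comes back as None (or an empty list for tags).
    """
    quote = None
    position = None
    for target in record.get("target", []):
        for selector in target.get("selector", []):
            kind = selector.get("type")
            if kind == "TextQuoteSelector":
                # The quoted passage plus its surrounding context.
                quote = {
                    "exact": selector.get("exact"),
                    "prefix": selector.get("prefix"),
                    "suffix": selector.get("suffix"),
                }
            elif kind == "TextPositionSelector":
                # Character offsets into the annotated HTML page's text.
                position = (selector.get("start"), selector.get("end"))
    return {
        "id": record.get("id"),
        "user": record.get("user"),
        "created": record.get("created"),
        "updated": record.get("updated"),
        "source": record.get("uri"),
        "link": record.get("links", {}).get("html"),
        "tags": record.get("tags", []),
        "text": record.get("text"),
        "quote": quote,
        "position": position,
    }
```

Loading the sample file with `json.load` and passing the resulting dict to `summarize_annotation` would give one flat dict per annotation. Note that the position offsets refer to the HTML page that was annotated, so they are unlikely to carry over to the TEI unchanged; the quote selector is probably the more portable anchor.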

ebeshero commented 5 years ago

Thanks, @mdlincoln! Here's a link to the original issue on the Pgh-Frankenstein repo, with comments from the hypothes.is developers, who evidently developed this JSON output at our request: https://github.com/PghFrankenstein/Pittsburgh_Frankenstein/issues/47#issuecomment-429058818

ebeshero commented 5 years ago

We're going to want to think about this shortly (as in next week), as soon as I've got the new edition files and spine files ready for the Variorum.

raffazizzi commented 5 years ago

@ebeshero @mdlincoln if we're still planning to apply for future funding, I think we should defer some of these decisions to a later project, as there's room to flesh things out more. There are lots of interesting issues around attaching annotations to the spine so that they "propagate" to all the other texts.

I would suggest simply attaching them to 1818 for now (as they have been so far) and saving the hard modelling and scripting work for later.

ebeshero commented 5 years ago

@raffazizzi Yes, that seems the simplest plan for the present, and let's take a closer look at how we can do it with the new batch of edition files coming. I agree, integrating with the full edition is a great activity for a grant funding request.

mdlincoln commented 5 years ago

> I would suggest simply attaching them to 1818 for now (as they have been so far)

wait, where in the 1818 TEI have they been attached?

ebeshero commented 5 years ago

@mdlincoln The annotations are only "attached" in the sense that they were initially made on my posting of the 1818 edition. (So they're not attached "under the hood" in the TEI code yet.)

mdlincoln commented 5 years ago

> @mdlincoln The annotations are only "attached" in the sense that they were initially made on my posting of the 1818 edition. (So they're not attached "under the hood" in the TEI code yet.)

In which case I'm afraid I don't understand what @raffazizzi means by attaching them there, since my understanding was that the goal is to get them out of Hypothesis. Or do you mean, extracting them from hypothesis and then attaching them to the original TEI of the 1818 ed?

raffazizzi commented 5 years ago

> Or do you mean, extracting them from hypothesis and then attaching them to the original TEI of the 1818 ed?

Close: I mean extracting them from hypothesis and then attaching them to the newly generated TEI (i.e. after @ebeshero's collation process) of the 1818 edition.
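
A rough sketch of what that extract-and-attach step could look like, assuming Python's standard `xml.etree.ElementTree`, TEI paragraphs encoded as `<p>` in the TEI namespace, and a match on the `TextQuoteSelector` text after whitespace normalization. The helper names `find_paragraph` and `attach_note` and the `type="hypothesis"` value are hypothetical, and the real collated TEI may well need a different anchoring strategy.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

def normalize(text: str) -> str:
    """Collapse runs of whitespace so HTML and TEI line breaks compare equal."""
    return re.sub(r"\s+", " ", text).strip()

def find_paragraph(tei_root, exact, prefix="", suffix=""):
    """Return the <p> that contains the quoted passage, or None.

    A paragraph matching prefix + exact + suffix wins immediately;
    otherwise the first paragraph containing just the exact quote is used.
    """
    quote = normalize(exact)
    in_context = normalize(prefix + exact + suffix)
    fallback = None
    for p in tei_root.iter(f"{{{TEI_NS}}}p"):
        text = normalize("".join(p.itertext()))
        if in_context in text:
            return p
        if fallback is None and quote in text:
            fallback = p
    return fallback

def attach_note(p, annotation_id, body):
    """Append a <note> to the paragraph, pointing back at the annotation."""
    note = ET.SubElement(p, f"{{{TEI_NS}}}note")
    note.set("type", "hypothesis")  # assumed value, not a project convention
    note.set("corresp", f"https://hypothes.is/a/{annotation_id}")
    note.text = body
    return note
```

This only attaches the note at paragraph level. Wrapping the exact quoted span would mean splitting text nodes across whatever inline markup the collation adds, which is part of the harder modelling work being deferred.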

mdlincoln commented 5 years ago

@raffazizzi got it! thanks for clarifying