Open jacobwegner opened 2 years ago
Another set of content:
Would also be good to figure out a pipeline for annotation curators (not just developers)
@gregorycrane Small status update, and we can talk more on Tuesday.
I've gotten the first pass of this deployed to beyond-translation-dev.
Here are the commentary annotations shown on the perseus-grc2 edition:
and for reference, the msA edition (which also has the older "scholia" widget in place):
--
A few things to note:
1) We're currently trying to match the fragment (called lemma in the hmt data, though not truly a lemma; cf. the TEI spec) with the word tokens. The hmt annotations aren't 1:1 with the text of the msA:
# the annotation
urn:cts:greekLit:tlg5026.msA.hmt:1.4.lemma#θεά
urn:cts:greekLit:tlg5026.msA.hmt:1.4.comment#οὕτως εἴωθε τὴν Μοῦσαν καλεῖν· ἀμέλει καὶ ἐν Ὀδυσσεία · ⁑
...
urn:cts:greekLit:tlg5026.msA.hmt:1.4#urn:cite2:cite:verbs.v1:commentsOn#urn:cts:greekLit:tlg0012.tlg001.msA:1.1
# the text
1.1#Μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος
So we could correct the fragments (to match our versions) or create some other form of standoff annotation for the scholia (around an actual lemma, or using a token offset, e.g. urn:cts:greekLit:tlg5026.msA.hmt:1.4 applies to the 2nd token in 1.1, etc.), but I don't think I can do a whole lot more with the current data set as is.
2) Speaking of current data sets... Neel and Chris have a newer release of the HMT data that might have some corrections / differences. I can try and circle back and update in the next week or two.
3) There are a couple of remaining functionalities I'd like to do with the commentary widget:
Scholia) widget
I also have a few things to tighten up in the code before things are ready for production, but figured getting early feedback would be useful.
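To make point 1 above concrete, here is a minimal Python sketch of the token-offset idea. It is not the code deployed to beyond-translation-dev; the helper names `strip_marks` and `token_offsets` are hypothetical, and it assumes whitespace tokenization and that ignoring accents and breathings is acceptable for matching. It shows why the fragment θεά (acute) fails a naive comparison with the token θεὰ (grave), and how a normalized comparison yields a 1-based token position that could serve as a standoff anchor.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Lowercase the text and drop combining marks (accents, breathings)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def token_offsets(fragment: str, line: str) -> list[int]:
    """Return the 1-based positions of the tokens in `line` covered by `fragment`.

    Hypothetical helper: tokens are split on whitespace and compared after
    stripping diacritics. An empty list means no match was found.
    """
    tokens = [strip_marks(t) for t in line.split()]
    wanted = [strip_marks(t) for t in fragment.split()]
    if not wanted:
        return []
    # Slide a window of len(wanted) tokens across the line.
    for start in range(len(tokens) - len(wanted) + 1):
        if tokens[start:start + len(wanted)] == wanted:
            return list(range(start + 1, start + len(wanted) + 1))
    return []


line = "Μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος"  # text of 1.1 from the sample above
print("θεά" in line.split())        # raw comparison: the accents differ
print(token_offsets("θεά", line))   # comparison after stripping diacritics
```

Stripping all diacritics is a blunt instrument (it would also conflate words that differ only by accent), so a curator-facing pipeline would probably want to review these matches rather than trust them blindly.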
@gregorycrane I spent an hour experimenting with fuzzy string matching against the HMT data set. There are more things to tune, but it resulted in a pretty good "coverage" improvement.
"Exact" matching:
"Fuzzy" matching:
By "pretty good", I mean that we can link the commentary fragment to at least one token 100% of the time; there are still some partial-match and boundary issues to resolve, though.
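For readers unfamiliar with the approach, the sketch below shows one way fuzzy matching of a fragment against word tokens can work. It is an illustration only: it uses `difflib.SequenceMatcher` from the Python standard library, which may not be what was used in the experiment, and the function name `best_token` and the 0.6 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def best_token(fragment: str, tokens: list[str], threshold: float = 0.6):
    """Return (index, token, score) for the token most similar to `fragment`.

    Hypothetical helper: similarity is SequenceMatcher's ratio (0.0 to 1.0).
    Returns None when `tokens` is empty or the best score is below `threshold`.
    """
    if not tokens:
        return None
    scored = [
        (SequenceMatcher(None, fragment, tok).ratio(), i, tok)
        for i, tok in enumerate(tokens)
    ]
    score, index, token = max(scored)
    if score < threshold:
        return None
    return index, token, score


tokens = "Μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος".split()
match = best_token("θεά", tokens)
if match is not None:
    index, token, score = match
    print(f"fragment matched token {index} ({token}) with score {score:.2f}")
```

A lower threshold raises coverage at the cost of more false links, which is one place the partial-match and boundary issues mentioned above can creep in.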
CommentaryWidget to scaife-viewer/frontend