Open jacobwegner opened 2 years ago
@jtauber I'm going to take a crack at this with something within the project that mirrors https://morph.perseus.org and can eventually be incorporated into https://github.com/jtauber/postag-convert.
I'll ping you on the data PR to see if what I'm doing makes sense.
@jtauber I may want some help with this after all; maybe we can sync up on Friday?
I might be able to cheat for now by extracting from the XML files and not the TSVs.
Otherwise I have to do some special-case handling in the TSVs to deal with things like subrefs or virtual exemplars.
Subrefs are still the way to handle this longer term, I think, but the token model in ATLAS is still pretty "white space word" specific.
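A minimal sketch of the XML route mentioned above, assuming the treebank files use `<sentence>` elements containing `<word>` elements with `form`, `lemma`, and `postag` attributes, and that a `subdoc` attribute on the sentence carries the passage reference. Those names, the sample data, the output keys, and the `extract_token_annotations` helper are all assumptions for illustration, not the project's actual pipeline.

```python
import xml.etree.ElementTree as ET

# Made-up sample in an assumed treebank-like layout; the values are
# illustrative and not copied from the real Odyssey treebank.
SAMPLE_XML = """\
<treebank>
  <sentence id="1" subdoc="1.1">
    <word id="1" form="ἄνδρα" lemma="ἀνήρ" postag="n-s---ma-"/>
    <word id="2" form="μοι" lemma="ἐγώ" postag="p-s----d-"/>
    <word id="3" form="" lemma="" postag=""/>
  </sentence>
</treebank>
"""


def extract_token_annotations(xml_text):
    """Return one dict per non-empty word element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    rows = []
    for sentence in root.iter("sentence"):
        # Assumed: the sentence-level "subdoc" attribute holds the reference.
        ref = sentence.get("subdoc", "")
        for word in sentence.iter("word"):
            form = word.get("form", "")
            if not form:
                # Skip placeholder nodes that carry no surface form.
                continue
            rows.append(
                {
                    "ref": ref,
                    "word_value": form,
                    "lemma": word.get("lemma", ""),
                    "tag": word.get("postag", ""),
                }
            )
    return rows


rows = extract_token_annotations(SAMPLE_XML)
for row in rows:
    print(row)
```

Reading the attributes straight from the XML sidesteps the special-case handling in the TSVs, though the rows still need to be aligned with the site's tokens (via `ve_ref` or subrefs) afterwards.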
Will need some good assertion tests for when we convert away from the ve_ref approach.
The XML files are working better; may still have some slight cleaning but we're probably like 98% there at this point.
I am hoping to deploy the Odyssey annotations later tonight.
@gregorycrane I've made a pass that brings (most of) the lemmas and postags over to the site for Odyssey, which should enable the use of the "traversal" widget:
@jtauber as we'd discussed on a call, it'd be great to get your help on importing token annotations for Odyssey from the treebank.
I'm still working through a bit of "branch hygiene", but targeting a format similar to what we've done for Iliad would be great!
https://github.com/scaife-viewer/beyond-translation-site/tree/cfe2e33bdf10bfce4b16b44120aadb993a6caa81/backend/data/annotations/token-annotations/iliad-crane-shamsian
Suggested structure:
Feedback on the structure, metadata, or our use of ve_ref is welcome (if, for example, we want to move from ve_ref to the subreference-ish scheme used on scaife.perseus.org, etc.).