scaife-viewer / beyond-translation-site

Site used to iterate on translation alignments within the Scaife Viewer ecosystem

bug in identifying cited lemmas #120

Open gregorycrane opened 1 year ago

gregorycrane commented 1 year ago

We are missing lemmas that do show up in our lexica. Nice example:

μῆνιν in Il. 1.1: https://beyond-translation.perseus.org/reader/urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1-1.20?mode=dictionary-entries&entryUrn=urn%3Acite2%3Ascafife-viewer%3Adictionary-entries.atlas_v1%3Alsj-67481

Click on it and we get the LSJ entry for this word. We do find Il. 1.1 in the entry.


The good news is that we know that this is properly encoded in the LSJ entry because the link on the citation is the correct one: https://beyond-translation.perseus.org/reader/urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1

So I think we may have an ingest bug.

gregorycrane commented 1 year ago

The same thing happens for Πηληϊάδεω in Il 1.1. We can see that Il 1.1 shows up in the entry in Hompers but it is not labelled as being "cited."


But Ἀτρεΐδης in line 7 does show up as cited -- so we are not missing all.

gregorycrane commented 1 year ago

this belongs in the BT issues.

jacobwegner commented 1 year ago

@gregorycrane I'll look into this a bit further today as well.

For Il. 1.1, the match is shown as "available" rather than "cited" because we can resolve the lemma for 1.1@μῆνιν (μῆνις) to the LSJ entry.

The reason "cited" is not being applied is that the underlying XML does not cite the passage URN:

<biblScope TEIform="biblScope">762</biblScope>
                  </bibl> (lyr.): but also, generally, of the <tr opt="n" TEIform="tr">wrath</tr> of Achilles, <bibl n="urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1:1"
                        default="NO"
                        TEIform="bibl">
                     <author TEIform="author">Il.</author>
                     <biblScope TEIform="biblScope">1.1</biblScope>

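Note urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1:1. It writes the passage as 1:1 rather than 1.1, and it points at perseus-grc1 rather than perseus-grc2.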

Clicking "Il 1.1" works because we're doing some cleanup on those references at a different stage in the ingestion process. I can try to do it earlier (so we're citing 1.1), but there would also need to be additional work to resolve perseus-grc1 to perseus-grc2.
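The cleanup described above could be sketched roughly as follows. This is only an illustration, not the site's actual ingestion code: the function name `normalize_citation_urn` and the `VERSION_ALIASES` mapping are hypothetical, and the sketch assumes that any extra colon-separated parts after the work component belong to the passage reference.

```python
# Hypothetical mapping from legacy version identifiers to the version the site serves.
VERSION_ALIASES = {"perseus-grc1": "perseus-grc2"}


def normalize_citation_urn(urn: str) -> str:
    """Return a CTS URN with a dotted passage reference and a current version."""
    parts = urn.split(":")
    # A well-formed CTS URN has five parts: urn, cts, namespace, work, passage.
    # More than five parts means the passage was written with colons (e.g. "1:1").
    if len(parts) < 5 or parts[:2] != ["urn", "cts"]:
        return urn  # leave anything unexpected untouched
    namespace, work = parts[2], parts[3]
    passage = ".".join(parts[4:])
    work_parts = work.split(".")
    # Swap the trailing version identifier if we know a replacement for it.
    work_parts[-1] = VERSION_ALIASES.get(work_parts[-1], work_parts[-1])
    return ":".join(["urn", "cts", namespace, ".".join(work_parts), passage])


print(normalize_citation_urn("urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1:1"))
```

Running the normalization before the "cited" comparison would let the LSJ reference line up with the passage URN used by the reader, assuming the two versions share the same line numbering.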

I can do something in the short term, but in the long term, I'd like to surface more of these "misses" in an error log and we could make changes in the PerseusDL/lexica repository to fix them.
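A minimal sketch of what surfacing those misses might look like, using Python's standard `logging` module. The function name `collect_citation_misses`, the logger name, and the idea of passing in a set of known passage URNs are all assumptions made for illustration; the real ingestion pipeline may be structured differently.

```python
import logging

# Hypothetical logger name; any name would work.
logger = logging.getLogger("lexica.citation_misses")


def collect_citation_misses(entry_urn, cited_urns, known_passages):
    """Return (and log) the citations in an entry that match no known passage URN."""
    misses = [urn for urn in cited_urns if urn not in known_passages]
    for urn in misses:
        logger.warning("unresolved citation in %s: %s", entry_urn, urn)
    return misses


logging.basicConfig(level=logging.WARNING)
known = {"urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1"}
cited = [
    "urn:cts:greekLit:tlg0012.tlg001.perseus-grc1:1:1",
    "urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1",
]
misses = collect_citation_misses(
    "urn:cite2:scafife-viewer:dictionary-entries.atlas_v1:lsj-67481", cited, known
)
```

The returned list could then be written out as a report and used to decide which references to correct upstream.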

jacobwegner commented 1 year ago

For 1.1@Πηληϊάδεω, we have the following values:

Lemma: Πηλείδης
Morphology widget: Πηληϊάδης

And two headwords:

<div xml:id="peleides-cunliffe-name" type="textpart" n="Πηλείδης">
    <head><foreign xml:lang="greek">Πηληιάδης</foreign></head>
    <p>= next.</p>
<p>Of <ref target="achilles-cunliffe-name">Achilles</ref> <bibl n="Hom. Il. 1.146">Il. 1.146</bibl>, <bibl n="Hom. Il. 15.64">Il. 15.64</bibl>, <bibl n="Hom. Il. 16.271">Il. 16.271</bibl>, <bibl n="Hom. Il. 17.105">Il. 17.105</bibl>, <bibl n="Hom. Il. 18.170">Il. 18.170</bibl>, <bibl n="Hom. Il. 19.83">Il. 19.83</bibl>, etc.: <bibl n="Hom. Od. 8.75">Od. 8.75</bibl>. </p>
</div>

<div xml:id="peleiades-cunliffe-name" type="textpart" n="Πηληϊάδης">
    <head><foreign xml:lang="greek">Πηληϊάδης</foreign></head>
    <p>Patronymic from prec.</p>
<p>Of <ref target="achilles-cunliffe-name">Achilles</ref> <bibl n="Hom. Il. 1.1">Il. 1.1</bibl>, <bibl n="Il. 1.322">322</bibl>, <bibl n="Hom. Il. 9.166">Il. 9.166</bibl>, <bibl n="Hom. Il. 16.269">Il. 16.269</bibl>, <bibl n="Il. 16.653">653</bibl>, <bibl n="Il. 16.686">686</bibl>, <bibl n="Hom. Il. 24.406">Il. 24.406</bibl>, <bibl n="Il. 24.431">431</bibl>, <bibl n="Il. 24.448">448</bibl>: <bibl n="Hom. Od. 11.467">Od. 11.467</bibl>, <bibl n="Od. 11.557">557</bibl>, <bibl n="Hom. Od. 24.15">Od. 24.15</bibl>. </p>
</div>

Like in https://github.com/scaife-viewer/beyond-translation-site/issues/121, this is a mismatch in which lemma is used when we click on an entry. The Morpheus lemma (Πηληϊάδης) is used when clicking, and 1.1 is cited within peleiades-cunliffe-name.

The treebank lemma (Πηλείδης) is not used when clicking, but it resolves to peleides-cunliffe-name, which does not cite 1.1.
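To make the mismatch concrete, here is a small sketch of how the label could differ depending on which lemma is consulted. It is not the site's implementation: `citation_status`, the `ENTRIES` dictionary, and the choice to try several candidate lemmas are hypothetical, and the citation sets are abbreviated from the two Cunliffe entries quoted above.

```python
# Abbreviated, hand-copied data from the two headword entries; keyed by the n attribute.
ENTRIES = {
    "Πηλείδης": {
        "id": "peleides-cunliffe-name",
        "citations": {"Il. 1.146", "Il. 15.64", "Od. 8.75"},
    },
    "Πηληϊάδης": {
        "id": "peleiades-cunliffe-name",
        "citations": {"Il. 1.1", "Il. 1.322", "Od. 11.467"},
    },
}


def citation_status(passage_ref, candidate_lemmas, entries_by_headword):
    """Return "cited" if any candidate's entry cites the passage,
    "available" if an entry exists but none cites it, otherwise None."""
    status = None
    for lemma in candidate_lemmas:
        entry = entries_by_headword.get(lemma)
        if entry is None:
            continue
        if passage_ref in entry["citations"]:
            return "cited"
        status = "available"
    return status


# Treebank lemma only: the entry exists but does not cite Il. 1.1.
print(citation_status("Il. 1.1", ["Πηλείδης"], ENTRIES))
# Treebank and Morpheus lemmas together: the second entry cites Il. 1.1.
print(citation_status("Il. 1.1", ["Πηλείδης", "Πηληϊάδης"], ENTRIES))
```

Checking both lemmas is only one possible way to reconcile the two; whether that is the right behavior for the reader is a separate decision.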