Open jacobwegner opened 1 year ago
@jtauber:
We had introduced a "normalized" version of the entry headword:
If I use the "display" value instead of the normalized value, things get cluttered:
I can make use of the "display" version when choosing a "sibling":
Any thoughts?
I'll get a deploy done soon so you can play around with this some more...
(To review character stripping)
@jtauber: Here is a better explanation of what is going on with δελφῖνάς in Odyssey 12.
If you click through to load this query:
https://tinyurl.com/gh-bt-142-sample
You can see that `headwordNormalizedStripped` for LSJ, Cunliffe and Cambridge is stored as δελφις.
`headword` is provided directly from each lexicon.
`headwordNormalized` is computed in `normalized_no_digits` (ἄωρος1 vs ἄωρος2, etc).
`headwordNormalizedStripped` is computed in `normalize_and_strip_marks`, using `UNICODE_MARK_CATEGORY_REGEX` (θεά1 and θέα2 in LSJ are distinct).

Beyond Translation is currently using `headwordNormalized` for the lookups; I believe this was done to avoid the exact kind of error where we might resolve both θεά and θέα within LSJ.
We're performing the exact same normalization from `headwordNormalized` on the search term provided by a user on the frontend.
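To make the difference between the two computed fields concrete, here is a minimal Python sketch. It is not the Beyond Translation code: the function names carry a `_sketch` suffix because their bodies are assumptions (NFC normalization, ASCII digits removed, and a `unicodedata` category check standing in for `UNICODE_MARK_CATEGORY_REGEX`).

```python
import re
import unicodedata


def normalized_no_digits_sketch(headword: str) -> str:
    """Assumed behaviour: NFC-normalize and drop homograph digits (e.g. ἄωρος1)."""
    return re.sub(r"[0-9]+", "", unicodedata.normalize("NFC", headword))


def normalize_and_strip_marks_sketch(headword: str) -> str:
    """Assumed behaviour: additionally remove every Unicode mark (category M*)."""
    # Decompose so accents, breathings and macrons become separate combining marks.
    decomposed = unicodedata.normalize("NFD", normalized_no_digits_sketch(headword))
    # Stand-in for UNICODE_MARK_CATEGORY_REGEX: keep only non-mark characters.
    kept = "".join(
        ch for ch in decomposed if not unicodedata.category(ch).startswith("M")
    )
    return unicodedata.normalize("NFC", kept)


# Sample headwords written with escapes so the exact code points are unambiguous.
THEA_GODDESS = "\u03b8\u03b5\u03ac1"  # θεά1
THEA_SIGHT = "\u03b8\u03ad\u03b12"  # θέα2
DELPHIS_MACRON = "\u03b4\u03b5\u03bb\u03c6\u1fd1\u0301\u03c2"  # δελφῑ́ς
```

Under these assumptions, the two θεα headwords stay distinct after `normalized_no_digits_sketch` but collapse to the same string after `normalize_and_strip_marks_sketch`, which is the ambiguity that looking up by `headwordNormalized` avoids.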
So, back to δελφῖνάς in Od. 12:
δελφίς
The `headwordNormalized` form for the Cambridge Greek Lexicon is δελφῑ́ς
If the `headword` in the file you're providing for the Cambridge Greek Lexicon were δελφίς or δελφίς, the `headwordNormalized` would then become δελφίς
`headwordDisplay` could continue to have δελφῑ́ς
Does that make sense to you? I have some additional things I'd like to document around this, but I think having this new `headwordDisplay` option will be a big help going forward.
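A rough sketch of how the proposal could behave at lookup time, again in Python. Everything here is hypothetical: the `Entry` class, its `headword_display` attribute, the `lookup` helper, and the NFC-plus-digit-removal `normalize` function are illustrations, not the actual Beyond Translation schema or resolver.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import List, Optional


def normalize(text: str) -> str:
    """Assumed stand-in for the headwordNormalized computation."""
    return re.sub(r"[0-9]+", "", unicodedata.normalize("NFC", text))


@dataclass
class Entry:
    headword: str
    headword_display: Optional[str] = None  # hypothetical headwordDisplay value

    @property
    def headword_normalized(self) -> str:
        return normalize(self.headword)

    @property
    def display(self) -> str:
        # Fall back to the plain headword when no display form is supplied.
        return self.headword_display or self.headword


def lookup(entries: List[Entry], search_term: str) -> List[Entry]:
    """Apply the same normalization to the search term as to the headwords."""
    key = normalize(search_term)
    return [entry for entry in entries if entry.headword_normalized == key]


DELPHIS = "\u03b4\u03b5\u03bb\u03c6\u03af\u03c2"  # δελφίς
DELPHIS_MACRON = "\u03b4\u03b5\u03bb\u03c6\u1fd1\u0301\u03c2"  # δελφῑ́ς

# Today: the macron form is the headword, so a δελφίς lookup misses it.
current = [Entry(headword=DELPHIS_MACRON)]
# Proposed: plain headword for lookups, macron form kept only for display.
proposed = [Entry(headword=DELPHIS, headword_display=DELPHIS_MACRON)]

misses = lookup(current, DELPHIS)
hits = lookup(proposed, DELPHIS)
```

With the display form split out, the lookup key and the rendered headword no longer have to be the same string, so the macron survives in the UI without affecting matching.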
(We should review this for Cambridge and Lexicon Thucydideum, as well as replicating what the "word study tool" does for lookups https://www.perseus.tufts.edu/hopper/morph?l=%CF%84%CE%B1%CF%81%CE%AC%CF%83%CF%83%CF%89&la=greek)
See our LSJ entries for ἄωρος in urn:cts:greekLit:tlg0012.tlg002.perseus-grc2:12.89: https://beyond-translation.perseus.org/reader/urn:cts:greekLit:tlg0012.tlg002.perseus-grc2:12.89?mode=dictionary-entries&entryUrn=urn%3Acite2%3Ascafife-viewer%3Adictionary-entries.atlas_v1%3Alsj-18938