translatable-exegetical-tools / Abbott-Smith

Abbott-Smith's Manual Greek Lexicon
What's the headword? #105

Open jonathanrobie opened 3 years ago

jonathanrobie commented 3 years ago

In the current structure, there's no simple, direct representation of the headword. The <orth> element doesn't quite do it because it often contains additional information or hyphens. The @n attribute contains the headword plus a Strong's number, so you have to break this apart.

Can we improve this?

<entry n="α|G1">
  <form>
    <orth>Α, α, ἄλφα</orth> (q.v.), <foreign xml:lang="grc">τό</foreign>, indecl., </form>
<entry n="ἐπαιτέω|G1871">
  <note type="occurrencesNT">1</note>
  <form>
    <orth>ἐπ-αιτέω</orth>, <foreign xml:lang="grc">-ῶ</foreign>, [in LXX: <ref osisRef="Psa.109.10">Ps 108 (109):10</ref> (H7592) <ref osisRef="Sir.40.28">Si 40:28</ref> * ;]</form>
  <sense n="1.">
    <gloss>to ask besides</gloss>. </sense>
  <sense n="2.">
    <gloss>to beg</gloss> (as a mendicant; cf. MM, <emph>Exp.</emph>, xiv): <ref osisRef="Luk.16.3">Lk 16:3</ref><ref osisRef="Luk.18.35">18:35</ref> (Cremer, 74).†</sense>
</entry>

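To make the "break this apart" step concrete, here is a minimal Python sketch of what a consumer of the current markup has to do. The function name `split_entry_n` is hypothetical, and the sketch assumes every @n value has the form headword, then "|", then the Strong's number, as in the two entries above.

```python
def split_entry_n(n_value: str) -> tuple[str, str]:
    """Split an @n value such as 'ἐπαιτέω|G1871' into (headword, strongs).

    Assumption: the headword and the Strong's number are separated by
    the first '|' in the value.
    """
    headword, sep, strongs = n_value.partition("|")
    if not sep:
        # No separator found: refuse to guess which half is missing.
        raise ValueError(f"no '|' separator in @n value: {n_value!r}")
    return headword, strongs


print(split_entry_n("ἐπαιτέω|G1871"))
```

Every tool that wants the headword has to repeat this split, which is the inconvenience the issue is about.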
cbearden commented 3 years ago

I take it we are limited by the TEI P5 schema. Are we talking about marking up the text just as it is, or also about adding a constructed headword for cases like "ἐπ-αιτέω"?

Of the available dictionary-related children of entry, I think form is the best fit. form does have @type, and the suggested values for @type in the P5 docs include simple, lemma, variant, compound, derivative, inflected, and phrase.

form may also contain form, and it may also contain character data directly, so one possibility is to enclose the array of form-related data in a parent form and then have specific form children with types to indicate what they represent. I don't know if I like that idea, but what do you think?

The Big Liddell at the Perseus Project is marked up with entryFree rather than entry, so I don't know if it can give us much precedent on this question. The Middle Liddell uses entry, however, and here is their version of "ἐπαιτέω":

<entry key="e)paite/w" type="cv" id="n11752">
  <form opt="n">
    <orth extent="full" lang="greek" opt="n">ἐπαιτέω</orth>
  </form>
  <note anchored="yes" type="infl" place="unspecified">fut. <foreign lang="greek">ήσω</foreign></note>
  <sense level="0" n="0" id="n11752.0" opt="n">
    <trans opt="n"><tr opt="n">to ask besides</tr></trans>, 
    <usg opt="n">Il.</usg>, 
    <usg opt="n">Soph.</usg>:—so in Mid., 
    <usg opt="n">id=Soph.</usg>
  </sense>
</entry>

I haven't had a chance yet to go through all the examples for form in the P5 docs. Perhaps some good possibilities there.

jonathanrobie commented 3 years ago

Since we cannot change the surface text, I think we are limited to attributes. And the @key attribute seems good to me.

So instead of:

<entry n="ἐπαιτέω|G1871">

I would prefer:

<entry key="ἐπαιτέω" n="G1871">

I have a query that creates this. Shall I make sure it validates and submit it as a pull request?
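For illustration only, the same rewrite can be sketched in Python with the standard library's ElementTree. This is not the query mentioned above; the function name `split_n_into_key` is hypothetical, the sample entry is trimmed and carries no TEI namespace, and nothing here checks whether the result validates against the schema.

```python
import xml.etree.ElementTree as ET


def split_n_into_key(entry: ET.Element) -> None:
    """Rewrite n="headword|Gnnnn" as key="headword" n="Gnnnn", in place."""
    headword, sep, strongs = entry.get("n", "").partition("|")
    if sep:  # entries whose @n has no '|' are left untouched
        entry.set("key", headword)
        entry.set("n", strongs)


# Trimmed, namespace-free sample (an assumption; the real files are fuller).
entry = ET.fromstring(
    '<entry n="ἐπαιτέω|G1871"><form><orth>ἐπ-αιτέω</orth></form></entry>'
)
split_n_into_key(entry)
print(entry.attrib)
```

Only attributes change; the text inside form and orth is not touched, which matches the constraint that the surface text stays as it is.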

destatez commented 3 years ago

Jonathan

I like this new attribute approach. It treats the Greek and the Strongs as separate entities and not as if they are alternatives. Go for it.

Dave

cbearden commented 3 years ago

I agree with David. Would it make sense to accept my pull request first? I have completed checking all A-S book abbreviations inside of ref tags, but I haven't had a chance to check the @.***` values or to compare those values to the A-S ones.

Chuck

Sent from my phone; forgive the terseness.

jonathanrobie commented 3 years ago

Yes, I'll do that. Off to church now ...

Jonathan

jonathanrobie commented 3 years ago

Neither a @key attribute nor a @lemma attribute validates against the TEI P5 dictionary schema that OSIS uses. And their documentation seems to ask for @n to do what the current @n attribute does.

@pdurusau, any advice?

cbearden commented 3 years ago

In case this helps narrow the search, it appears to me as if the TEI/OSIS schema permits all the attributes shown for entry in the P5 documentation section on dictionaries:

https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-entry.html

except the "responsibility" attributes:

att.global
    att.global.rendition
    att.global.linking
    att.global.analytic
    att.global.facs
    att.global.change
    att.global.source
att.entryLike
att.sortable

The att.global.responsibility attributes are defined in the schema, but I couldn't find a path from them to one of the references under the definition for entry. Note that I'm not all that used to XSD, so I may have missed something.

pdurusau commented 3 years ago

Jonathan, if you mean to distinguish between "lemma" as a value for type, versus "lemma" as an attribute when you invoke the analysis module:

https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.linguistic.html

I think we pulled key from att.canonical to provide canonical references. It's been a while but that's what I'm seeing from the TEI docs.

https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.canonical.html

jonathanrobie commented 3 years ago

@pdurusau How hard would it be to add @lemma as an attribute on the entry element?

pdurusau commented 3 years ago

Jonathan, after poking around a bit more, we do have "type" on form, which the TEI suggests:

"form (form information group) groups all the information on the written and spoken forms of one headword. @type classifies form as simple, compound, etc. Suggested values include: 1] simple; 2] lemma; 3] variant; 4] compound; 5] derivative; 6] inflected; 7] phrase"

There's not a lot of commentary in the Guidelines but you can have multiple form elements within a single entry, think OED for example, so the schema allows for you to distinguish each form by some typology.

For indexing/sorting purposes would entryFree/form[@type="lemma"]/orth suffice? It's likely more flex than we need right now but could be useful with more complex sources.

If this is acceptable, then no change required.
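
As a rough sketch of how an indexer could use that path, the Python below looks up form[@type="lemma"]/orth with the standard library's ElementTree. The sample entry is hypothetical: it uses entry rather than entryFree, omits the TEI namespace, and adds a second form element that is not in the current files.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the second <form> is an illustration, not existing data.
SAMPLE = """
<entry n="ἐπαιτέω|G1871">
  <form><orth>ἐπ-αιτέω</orth></form>
  <form type="lemma"><orth>ἐπαιτέω</orth></form>
</entry>
"""

entry = ET.fromstring(SAMPLE)

# ElementTree's limited XPath supports [@attrib='value'] predicates.
lemma = entry.findtext("form[@type='lemma']/orth")
print(lemma)
```

The lookup itself is simple; the cost of this approach is the extra element carrying a second copy of the word.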

jonathanrobie commented 3 years ago

Hi Patrick - the basic issue is that a lot of these <orth> elements contain hyphens and such. For instance:

<entry n="ὑπόκρισις|G5272">
  <note type="occurrencesNT">10</note>
  <form>** <orth>ὑπό-κρισις</orth>, <foreign xml:lang="grc">-εως, ἡ</foreign></form>

The ὑπόκρισις string in the @n attribute is what we want as the headword, and it has to occur in an attribute because we do not want to change the surface text. So I would like to have two attributes in entry, the value of one is ὑπόκρισις, the value of the other is G5272.

pdurusau commented 3 years ago

Jonathan,

Sorry if I was unclear. Thanks for the example!

What I am suggesting is:

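<note type="occurrencesNT">10</note>
<form>** <orth>ὑπό-κρισις</orth>, <foreign xml:lang="grc">-εως, ἡ</foreign></form>
<form type="lemma"><orth>ὑπόκρισις</orth></form>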
I may be missing what you mean by "surface" text? There is a global @resp attribute if we need to distinguish between elements that capture Abbott-Smith as written and any additions that we make to the text. If that helps, we aren't limited to attributes. May help with incorporation of other content.

jonathanrobie commented 3 years ago

I don't want to add a new element with the same text, except the hyphen is removed. And if we can avoid it, I don't want to add our own elements with new text except in notes.

How about wrapping the word in a w element and adding the attribute we need there?

<entry n="G5272">
  <note type="occurrencesNT">10</note>
  <form>** <orth><w lemma="ὑπόκρισις">ὑπό-κρισις</w></orth>, <foreign xml:lang="grc">-εως, ἡ</foreign></form>
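
A small Python sketch of how this proposal would be consumed, assuming the w element and its @lemma attribute are accepted by the schema (not verified here). The sample is the excerpt above with a closing tag added and no TEI namespace; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The excerpt above, closed off so that it parses on its own.
SAMPLE = (
    '<entry n="G5272">'
    '<note type="occurrencesNT">10</note>'
    '<form>** <orth><w lemma="ὑπόκρισις">ὑπό-κρισις</w></orth>, '
    '<foreign xml:lang="grc">-εως, ἡ</foreign></form>'
    '</entry>'
)

entry = ET.fromstring(SAMPLE)

# The headword comes from the attribute, the Strong's number from @n.
headword = entry.find("form/orth/w").get("lemma")
strongs = entry.get("n")

# The surface text is whatever the form element contains, tags removed.
surface = "".join(entry.find("form").itertext())

print(headword, strongs)
print(surface)
```

The hyphenated form stays in the text exactly as printed, and the clean headword is available without splitting any attribute value.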