Open phoenix-mossimo opened 4 years ago
In TEI Lex-0 Etym (still to be published), we would do this in etymology.
Here's an example with "Brexit":
<entry xml:lang="en">
<form type="lemma">
<orth>Brexit</orth>
</form>
<gramGrp>
<gram type="pos">noun</gram>
</gramGrp>
<sense>
....
</sense>
<etym type="portmanteau">
<lbl>portmanteau of</lbl>
<cit type="etymon">
<form><orth>Britain</orth></form>
<gramGrp>
<gram type="pos">noun</gram>
</gramGrp>
</cit>
<pc>+</pc>
<cit type="etymon">
<form><orth>exit</orth></form>
<!-- add pron? -->
<gramGrp>
<gram type="pos">verb</gram>
</gramGrp>
</cit>
<seg type="desc">Formed by analogy with the earlier coined
term (but unrealized event) of <xr type="crossReference">Grexit</xr></seg>
</etym>
</entry>
I guess an important issue is whether you are encoding a print source (and thus
need to keep the original ordering or content) or a born-digital
source where you can choose how to structure it. If it's the first and you
need to present that info how it is in your example, nesting
Also in Lex-0 we use
On Tue, Jan 28, 2020 at 7:16 PM Maxim Kupreyev notifications@github.com wrote:
Is there a LEX-0-conformant way to encode portmanteau forms?
In the Coptic dictionary we have a number of cases where one indivisible form corresponds to what normally constitutes two grammatical categories with overlapping properties. For example, mono-morphemic possessive pronouns were originally poly-morphemic, composed of two roots: demonstrative + (more ancient) possessive pronoun (i.e. "your house" is actually "this" + "your" + "house"). In Coptic both roots are still clearly distinguishable, but the grammaticalization process has embraced the 1st and 2nd person, which have fused into one form (i.e. creating a kind of "thior" house).
As the grammatical information contained in fused morphs might be overlapping (i.e. the possessor is feminine, the possessed is masculine), we decided to create nested tags, which encode possessor and possessum separately, e.g.:
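<gramGrp>
<gramGrp><pos>Demonstrative</pos><gen>m.</gen><number>sg.</number></gramGrp>
<gramGrp><pos>Possessive suffix pronoun</pos><subc>2. Pers.</subc><gen>f.</gen><number>sg.</number></gramGrp>
</gramGrp>
Is this OK from the LEX-0 point of view?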
Hm, the <etym> section is certainly an option, but the problem is that the discrepancy in grammatical information applies to the current form. In your example "Brexit" is clearly a noun (<gram type="pos">noun</gram>), which derives from a verb and a noun (encoded in the <etym> section). In our case "pa" belongs to two grammatical categories: it is both a demonstrative and a possessive pronoun.
Is there a LEX-0-conformant way to encode portmanteau forms?
In the Coptic dictionary we have a number of cases where one indivisible form corresponds to what normally constitutes two grammatical categories with overlapping properties. For example, mono-morphemic possessive pronouns were originally poly-morphemic, composed of two roots: demonstrative + (more ancient) possessive pronoun (i.e. "your house" is actually "this" + "your" + "house"). In Coptic both roots are still clearly distinguishable, but the grammaticalization process has embraced the 1st and 2nd person, which have fused into one form (i.e. creating a kind of "thior" house).
As the fused morphs belong to different grammatical categories, we decided to create nested tags, which encode possessor and possessum information separately, e.g.:
<gramGrp>
<gramGrp><pos>Demonstrative</pos><gen>m.</gen><number>sg.</number></gramGrp>
<gramGrp><pos>Possessive suffix pronoun</pos><subc>2. Pers.</subc><gen>f.</gen><number>sg.</number></gramGrp>
</gramGrp>
Is this OK from the LEX-0 point of view?
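For comparison, here is a rough sketch of the same nested structure written with typed <gram> elements, in the style of the <gram type="pos"> used in the "Brexit" example. It is an illustration, not a confirmed Lex-0 recommendation: whether nested <gramGrp> is acceptable is exactly the open question here, the type values "gender", "number" and "person" are assumptions, and so is reading <subc>2. Pers.</subc> as grammatical person.

```xml
<!-- Sketch only. Nesting gramGrp inside gramGrp is the structure proposed
     in the question above; it has not been confirmed as Lex-0 conformant. -->
<gramGrp>
  <!-- Demonstrative part of the fused form (possessum agreement) -->
  <gramGrp>
    <gram type="pos">Demonstrative</gram>
    <gram type="gender">m.</gram>   <!-- assumed type value -->
    <gram type="number">sg.</gram>  <!-- assumed type value -->
  </gramGrp>
  <!-- Possessive part of the fused form (possessor) -->
  <gramGrp>
    <gram type="pos">Possessive suffix pronoun</gram>
    <gram type="person">2. Pers.</gram>  <!-- assumed mapping from subc -->
    <gram type="gender">f.</gram>
    <gram type="number">sg.</gram>
  </gramGrp>
</gramGrp>
```

The labels are copied unchanged from the original example; only the element names differ.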