JessedeDoes closed 1 year ago
Funny you should ask this, because earlier today we closed #24, which dealt with the content model of <def>. We can now use <xr> in <def>.
As for <colloc>, we also discussed this today, and I proposed that we go for <cit type='collocation'><quote>blah blah</quote></cit>. For this we don't have to change anything further about <def>.
So I'd recommend you give up <colloc> and use <cit type='collocation'> instead. The advantage of <cit> over <colloc> is twofold: 1) consistency with examples, translations etc., and 2) within <cit>, we can have <def>s, other <cit>s etc., so if a dictionary translates or defines collocations (that was the example that started our discussions today), we can deal with it. <colloc> is too restrictive in this sense.
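To illustrate the second point, here is a minimal sketch of a collocation that carries its own definition inside <cit>. The wording is borrowed from the "black sheep" example quoted elsewhere in this thread; placing the <cit> directly in <sense> is an assumption for illustration, not an agreed encoding.

```xml
<sense>
  <!-- Assumption: the collocation sits directly inside <sense> -->
  <cit type="collocation">
    <quote>black sheep</quote>
    <!-- The definition belongs to the collocation, not to the headword -->
    <def>an odd or disreputable member of a group, especially within a family</def>
  </cit>
</sense>
```

With <colloc> there would be no place to attach the <def> to the collocation itself.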
@JessedeDoes I don't quite see how this is a definition in the first place. It doesn't tell us anything about the meaning of »hand«. It does give collocators for »rechterhand« and »linkerhand« respectively, and it describes the extra-linguistic constraint »in speech directed to children«, which is genuinely <usg>, right?
So yes, I'd opt for a different encoding here, which of course is not so easy because of the discursive nature of the description. Something like:
<sense>
<usg type="socioCultural">In de taal die men jegens kinderen bezigt</usg><pc>,</pc>
wordt de rechterhand vaak
<cit type="collocation">
<quote>het mooie, schoone, fraaie, goede, zoete handje</quote>
</cit>
genoemd;
de linkerhand heet dan bij tegenstelling
<cit type="collocation">
<quote>het leelijke of verkeerde handje</quote>
</cit><pc>.</pc>
</sense>
Currently, I have no immediate idea what to do with the stretches of text directly inside <sense>. For the etymology, we'd pull those into a <cit> and use <seg> to capture the »discursive glue«. So maybe an overarching <cit> of some as yet unclear @type could be used to comprise the free-floating text and the two proper cit/@type="collocation"?
<sense>
<usg type="socioCultural">In de taal die men jegens kinderen bezigt</usg><pc>,</pc>
<cit type="???">
<seg>wordt de rechterhand vaak</seg>
<cit type="collocation">
<quote>het mooie, schoone, fraaie, goede, zoete handje</quote>
</cit>
<seg>genoemd</seg><pc>;</pc>
<seg>de linkerhand heet dan bij tegenstelling</seg>
<cit type="collocation">
<quote>het leelijke of verkeerde handje</quote>
</cit><pc>.</pc>
</cit>
</sense>
Or maybe I'm led a bit astray here?
Maybe the example was not especially felicitous. The general idea is that the <def> defines the collocations that it contains, a bit like the following Wikipedia example:
<def>
In the English language, <colloc?>black sheep</colloc> is an idiom used to describe an odd or disreputable member of a group, especially within a family.
</def>
An option using nested entries would be something like this (Katrien would disagree):
<entry>
<form type='lemma'>sheep</form>
<entry type='mwe'>
<def>
In the English language, <form type='lemma'>black sheep</form> is an idiom used to describe an odd or disreputable member of a group, especially within a family.
</def>
</entry>
</entry>
We see this more as a cit of type example, or a cit of some other type, than as a proper definition. The part "odd or disreputable member of a group, especially within a family" is a definition, but since it is embedded in this more narrative structure, we don't think the whole sentence should be encoded as a definition.
In some dictionaries (WNT for instance), we have forms and collocs embedded in definitions. Example:
What would you suggest? Broadening the content model of <def>, or a different encoding?