raffazizzi closed 9 years ago
Logged In: YES user_id=1021146
If you simply want this facility in order to provide different rendering styles, then the rend attribute should be used.
The purpose of <mentioned> is to distinguish cases of "mention" from use. If you additionally want to say something about the ontological status of the mentioned entity, the way to do it is by embedding a more ontologically specific tag (e.g. <term>, <ident>) within the <mentioned> tag, I think. (And <ident> does have a type attribute.)
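A minimal sketch of the embedding suggested here, assuming an invented sample sentence: rend carries the rendering hint, <mentioned> marks the mention, and the nested <term> says what kind of thing is mentioned. The sentence and the rend value are illustrative assumptions, not taken from the Guidelines.

```xml
<!-- Hypothetical sentence: rend handles rendering, the nested term gives the ontological type -->
<p>The word <mentioned rend="italic"><term>phoneme</term></mentioned> was coined in the nineteenth century.</p>
```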
Original comment by: lb42
Logged In: YES user_id=950793
The value of the proposed "type" attribute should not specify the rendition. (The examples in my proposal above only gave typical renditions.) In fact, the rendition of, say, a <mentioned type="graphs"> element can vary according to style or context (e.g. block vs. inline).
Of course, you can always specify the ontological type of the mentioned entity by some subelement. Personally, I'd prefer <seg> to <ident> because <seg> also allows for non-PCDATA content.
However, I'd like to argue that an (optional) type distinction on <mentioned> makes sense. A mention always involves mentioning something; and mentioning a word, say, qua graphic entity or qua graphematic entity results in different mentions. Distinct rendition conventions only reflect these type distinctions.
Original comment by: nolda
Original comment by: lb42
Logged In: YES user_id=686243
Maybe I'm being dense, but when is a grapheme or a phoneme or a phone ever actually used as opposed to mentioned? That is, isn't some element (<seg type="phoneme"> jumps to mind for interchange, although one might prefer to have a <phoneme> element for local usage) that indicates its content is a phoneme sufficient without being inside a <mentioned>? Perhaps some examples will help straighten me out.
Original comment by: sydb
Original comment by: lb42
Logged In: YES user_id=950793
So far three different approaches have been suggested on this page:
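The three encodings can be sketched roughly as follows. This is a reconstruction from the rest of this comment, not a quotation: the sample word "Karl" and the type value "phonemes" are borrowed from the example further down, and the exact form of each variant is an assumption.

```xml
<!-- (1) Proposed: an optional type attribute directly on mentioned -->
<mentioned type="phonemes">Karl</mentioned>

<!-- (2) A typed subelement embedded within mentioned -->
<mentioned><seg type="phonemes">Karl</seg></mentioned>

<!-- (3) A typed seg on its own, without mentioned -->
<seg type="phonemes">Karl</seg>
```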
In my view, (3) does not express that "Karl" is mentioned as opposed to being used. <seg> just attaches some type to its content and is neutral with respect to the mention/use distinction. (Cf. a corpus encoder tagging some lemma by <seg> with part-of-speech information: he does not imply that the lemma is mentioned, instead of being used, in its context.)
As to (2), that method is more verbose than (1). What is more, (2) is also less restrictive than (1). As far as I can see, a string is always mentioned either as a sequence of graphemes or as a sequence of phonemes or as whatever; there are no 'mixed' real-world examples like:
<mentioned> <seg type="phonemes">Karl</seg> <seg type="graphemes">kommt</seg> </mentioned>
In sum, I'd opt for method (1).
Original comment by: nolda
Please add an optional "type" attribute to <mentioned>, specifying the ontological or linguistic type of the mentioned entity.
Sample values could include:
* term (typically rendered in quotes)
Original comment by: nolda