Open nschneid opened 11 months ago
Yes, @nschneid, I think this would be a great idea. It might matter less in well-punctuated texts, where mentions are set off with quotation marks and an escort like "the word" is present, but it would definitely be useful, even for the examples given in the guidelines (repeated here):
"Yes":
Yes, I think so.
I am waiting for his ‘yes’ on the matter.
"precede":
Such discussion must precede every decision.
He pronounced ‘precede’ in a funny way.
So essentially, the feature would help most in situations and languages where there is no marking other than "Mentioned=Yes".
Sure, why not? The only issue I see is in implementing it: if the feature appears in only some places, people might get the idea that the entire dataset is annotated for it exhaustively.
BTW there are tons of these, and also borderline cases, in the dictionary genre in UD_English-GENTLE (three of the documents are literally dictionary entries, incl. things like etymology and cognates, but also example usage, which may or may not be considered metalinguistic).
The guidelines have a policy that metalinguistic mentions of words should be tagged the same as if they were uses.
I propose an experimental MISC feature Mentioned=Yes to make cases of mentioned language explicit. Some examples in EWT where this would clearly apply (found by searching for "the word"):
Note that this would be distinct from quotations reflecting a real or hypothetical speech act. Mentioned=Yes is for linguistic expressions referred to as entities, typically treated syntactically like nominals (even if the UPOS is something else).
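To make the proposal concrete, here is a minimal Python sketch of how a consumer of the data could pull out tokens carrying the feature. It assumes standard 10-column, tab-separated CoNLL-U with the feature stored in MISC; the helper names (`parse_misc`, `mentioned_tokens`) and the sample annotation are hypothetical, and columns that are not needed for the illustration are left as `_`.

```python
def parse_misc(misc):
    """Split a MISC value such as 'Mentioned=Yes|SpaceAfter=No' into a dict."""
    if misc == "_":
        return {}
    feats = {}
    for item in misc.split("|"):
        key, _, value = item.partition("=")
        feats[key] = value
    return feats


def mentioned_tokens(conllu_text):
    """Return (ID, FORM, UPOS) for every token whose MISC has Mentioned=Yes."""
    hits = []
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        if parse_misc(cols[9]).get("Mentioned") == "Yes":
            hits.append((cols[0], cols[1], cols[3]))
    return hits


# Hypothetical annotation of one guideline example: (ID, FORM, LEMMA, UPOS, MISC).
# 'precede' keeps the UPOS it would have as a use and gains Mentioned=Yes.
ROWS = [
    ("1", "He", "he", "PRON", "_"),
    ("2", "pronounced", "pronounce", "VERB", "_"),
    ("3", "‘", "‘", "PUNCT", "SpaceAfter=No"),
    ("4", "precede", "precede", "VERB", "Mentioned=Yes|SpaceAfter=No"),
    ("5", "’", "’", "PUNCT", "_"),
    ("6", "in", "in", "ADP", "_"),
    ("7", "a", "a", "DET", "_"),
    ("8", "funny", "funny", "ADJ", "_"),
    ("9", "way", "way", "NOUN", "SpaceAfter=No"),
    ("10", ".", ".", "PUNCT", "_"),
]
SAMPLE = "# text = He pronounced ‘precede’ in a funny way.\n" + "\n".join(
    "\t".join([tid, form, lemma, upos, "_", "_", "_", "_", "_", misc])
    for tid, form, lemma, upos, misc in ROWS
)

print(mentioned_tokens(SAMPLE))
```

Because the feature lives in MISC, the UPOS column is untouched, which is in line with the policy of tagging mentions the same as uses. A filter like this only finds what has been annotated; it says nothing about whether a treebank is annotated exhaustively.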