JMdictProject / JMdictIssues

JMdict Japanese dictionary - lexicographic, etc. issues management

形容動詞 as nouns #37

Closed JMdictProject closed 1 year ago

JMdictProject commented 2 years ago

In the 余計 entry (see: https://www.edrdg.org/jmdictdb/cgi-bin/entr.py?svc=jmdict&sid=&q=1544090), Robin commented: "Despite what the kokugos say, I don't think any of these senses are nouns." I commented: "I quite agree. Many 形容動詞 get [名] tags in the kokugos despite them demonstrably only being used as adjectives in modern Japanese. I think this is due to them conforming to "classical" grammatical concepts. We tend to tag them as "adj-na,n", I would be quite happy to cull the "n" where appropriate."

So, do we adopt a policy of making those "ナリ|名" terms just "adj-na" when there's no evidence of them being used as nouns? Or do we play it safe and continue to tack an "n" on the end of the POS because the kokugos say so?

robinjmdict commented 2 years ago

I think that's sensible. However, there's almost always at least some evidence of noun usage for 名-tagged 形容動詞. Searching "余計を" on Google Books returns a few dozen results.

We'd need to decide on some sort of threshold. In the case of 余計, the combined counts for 余計が, 余計を and 余計は are less than 0.04% of the total 余計 counts. I interpret that to mean that noun usage is either non-standard, obsolete or just simply incorrect, and therefore the word should not be tagged as a noun. But what if it was 0.4%? Is that enough to warrant an "n" tag? Where's the cutoff?

Marcusjmdict commented 2 years ago

Are there any words that we could agree are non-controversially both adj-na and n? If so, we could maybe figure out what the typical adj-na to n ratio for such words might be. (I can't think of any.)

JMdictProject commented 2 years ago

A possible example is:

艶美 (えんび) (adj-na,n) beauty; charm; 162435

The references all have it as both noun and な adjective. The n-grams say:

艶美 | 7364
艶美は | 130
艶美が | 69
艶美な | 1173
艶美の | 762
艶美と | 133
艶美さ | 87
艶美に | 158
艶美を | 356

It seems the noun-ish usage is about 1/4 of the adjectival/adverbial usage. I think "adj-na,n" is a good POS for it. Maybe a 3:1 or 4:1 ratio would be a good indicator for this sort of thing.
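For anyone wanting to redo this sort of tally, here is a rough sketch in Python using the 艶美 counts quoted above. The thread never says exactly which particles count as "noun-ish" and which as "adjectival/adverbial", so the grouping below (は/が/を versus な/に) is only an assumption, and `usage_totals` is a made-up helper name.

```python
# N-gram counts for 艶美 as quoted in the comment above.
COUNTS = {
    "艶美は": 130,
    "艶美が": 69,
    "艶美な": 1173,
    "艶美の": 762,
    "艶美と": 133,
    "艶美さ": 87,
    "艶美に": 158,
    "艶美を": 356,
}

# Assumption: these groupings are illustrative, not an agreed JMdict rule.
NOUN_PARTICLES = ("は", "が", "を")   # treated as clear noun usage
ADJ_PARTICLES = ("な", "に")          # treated as adjectival/adverbial usage


def usage_totals(word, counts,
                 noun_particles=NOUN_PARTICLES,
                 adj_particles=ADJ_PARTICLES):
    """Return (noun_total, adj_total) for word + particle n-gram counts."""
    noun = sum(counts.get(word + p, 0) for p in noun_particles)
    adj = sum(counts.get(word + p, 0) for p in adj_particles)
    return noun, adj


noun_total, adj_total = usage_totals("艶美", COUNTS)
print("noun-ish:", noun_total, "adjectival/adverbial:", adj_total)
if noun_total:
    print(f"adjectival : noun-ish = {adj_total / noun_total:.1f} : 1")
```

The result is sensitive to the grouping. With は/が/を counted as noun-ish, the ratio for 艶美 is roughly 2.4:1; counting only を against な/に gives roughly 3.7:1. Ambiguous forms such as ~の and ~と are left out of both groups here.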

JMdictProject commented 1 year ago

This has been quiet for a year and can probably be closed. I think we're agreed that if the ~な/~が ratio is about 4:1 or 3:1 then [adj-na,n] is appropriate, but if the clear noun usage is lower than that we can go with just [adj-na].
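As a rough illustration of that rule of thumb, a small Python sketch follows. The function name `suggest_pos` and the hard cutoff `max_ratio=4.0` are assumptions made for the example; the thread only says "about 4:1 or 3:1", so borderline cases would still need an editor's judgement. It also assumes the word is already established as a 形容動詞.

```python
def suggest_pos(adj_count, noun_count, max_ratio=4.0):
    """Suggest a POS tag from adjectival and clear-noun usage counts.

    max_ratio is a hypothetical cutoff: if adjectival usage is at most
    max_ratio times the noun usage, keep the "n" tag.
    """
    if noun_count > 0 and adj_count / noun_count <= max_ratio:
        return "adj-na,n"
    # No noun usage at all, or noun usage too rare relative to adjectival usage.
    return "adj-na"


# Made-up counts, purely for illustration.
print(suggest_pos(adj_count=1200, noun_count=400))   # ratio 3:1
print(suggest_pos(adj_count=10000, noun_count=4))    # ratio 2500:1
```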