Closed by vcvpaiva 5 years ago
I understand this as a request to add the forms "scumbag", "ping pong" and "pingpong" to the appropriate synsets.
There are many others in this category. The request here is for a guideline establishing that when different spellings are in use, both are listed. Thanks!
I'd argue against such a guideline, which would reduce spaces and hyphens to second-class status in determining semantics and acceptability.
As far as I can tell, PWN has not made a habit of choosing one form over the other, but
rather has made each form earn its place as an accepted variant, as opposed to being
something that may have corpus hits, but is an ephemeral, accidental, or mistaken usage.
I think folks agree that WN isn't prescriptivist lexicography. Nevertheless most entries
do reflect judgements about spelling, usage, semantic granularity, and the like -- synsets
don't contain every possible form, and entries don't include every possible sense. Imho
this winnowing is a great advantage in most WN applications.
In this case, "volley ball" does appear in, say, Google Ngrams, but at about the same frequency as "base ball" and "basket ball", both of which I'd consider unacceptable.
Regarding "scum bag" -- which is even less common than the three ball games -- I think PWN got this one wrong. That (infrequent) spelling refers to a condom, esp. one that has been used. This was common slang usage in the 1960s; I first heard it when an older boy explained to me why his pals were snickering about the James Brown song "Papa's Got a Brand New Bag."
"Scumbag" refers to the person. This roughly parallels other -bag constructions like "dirt bag / dirtbag" and "douche bag / douchebag" (see also "The '-bag' of 'slutbag'", http://languagelog.ldc.upenn.edu/nll/?p=5560). I'd add "scumbag#1" to the scum_bag#1 synset, and add a scum_bag#2 synset for the condom (as well as a douchebag#1 for the person, since "douche_bag#1" is already there for the object).
I think we need a guideline here, one founded on frequency statistics. That is, if both forms occur frequently enough to be included, then we include both; if only one does, then we include only that form.
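To make the proposal concrete, here is a rough sketch of how such a rule could be applied. Everything in it is an assumption for illustration: the function name `forms_to_include`, the threshold value, and the frequency numbers are made up, and nothing here comes from PWN or from real corpus counts.

```python
def forms_to_include(freqs, threshold):
    """Return the spelling variants whose frequency meets the threshold.

    freqs maps each candidate form to a corpus frequency; threshold is the
    minimum frequency a form must reach to be listed. Both are hypothetical
    inputs, and choosing the threshold is the real policy question.
    """
    return sorted(form for form, freq in freqs.items() if freq >= threshold)


# Invented frequencies, NOT real corpus data.
candidates = {"volleyball": 120.0, "volley ball": 0.4}
print(forms_to_include(candidates, threshold=1.0))
```

With these invented numbers only one form clears the bar, so only that form would be listed. If both cleared it, both would be listed.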
volleyball vs. volley ball, ping-pong vs. ping pong: I believe the lexicon should have both forms, not simply choose one or the other.