Open mulliganaceous opened 9 months ago
Hi @mulliganaceous
To me, as a native Vietnamese speaker, "nghiễng" seems strange. I cannot find it in the online dictionary I currently maintain:
https://tudien.net/?word=nghi%E1%BB%85ng&dictionary=vi-vi
"nghiễng" in the image above seems to be extracted from "tam thiên tự" (字學纂要), though I might be wrong, as I only remember a few words. "Nghiễng", I guess, is not Vietnamese.
Vietnamese has fewer than 7000 syllables, so the data you mentioned are not correct.
I personally believe that "nghiêng" is the only syllable that has 7 letters.
(Original source: Reddit)
After examining automatic language detectors and lists of longest words, I found that Vietnamese is a special case, mainly because the language is composed almost entirely of short words of no more than 6 letters. Occasional longer words are almost exclusively loanwords.
According to Wikipedia, the longest word (under this definition) is nghiêng, meaning 'inclined'. What strikes me is the wording of this statement, as it implies that nghiêng is the only native Vietnamese word with 7 letters, and that there are no native Vietnamese words with 8 or more letters. There are hundreds of different Vietnamese words with 6 letters, such as Nguyễn (much more common than Smith), trưởng (chief), and khuynh (inclined). Is it true that nghiêng and its tonal counterparts are the only seven-letter native Vietnamese words?
Research
Technically, Vietnamese separates strings of letters at a morpheme level, and each morpheme is a syllable in Vietnamese. To the uninitiated, it seems that every native Vietnamese word is of one syllable.
There is an online resource which lists all native Vietnamese words (technically, single-syllable morphemes) of the Vietnamese language. I ran a simple Python program that sorts and categorizes each Vietnamese word by length. I used three lists that are used in actual programs or research projects (7184-source, 7884-source, all syllables). Here are my results:
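The original script is not shown, so the following is only a minimal sketch of how such a length count could be done. The function names `letter_count` and `group_by_length` and the sample list are hypothetical. The one assumption that matters is that each syllable is normalized to Unicode NFC first, so that a letter such as ễ counts as one letter even if the source file stores it as a base letter plus combining marks.

```python
import unicodedata
from collections import defaultdict


def letter_count(syllable: str) -> int:
    """Count letters, treating a base letter plus its diacritics as one letter."""
    # NFC composes e.g. 'e' + circumflex + tilde into the single code point 'ễ'.
    return len(unicodedata.normalize("NFC", syllable.strip()))


def group_by_length(syllables):
    """Map each letter count to the sorted, de-duplicated syllables of that length."""
    groups = defaultdict(set)
    for s in syllables:
        s = unicodedata.normalize("NFC", s.strip())
        if s:  # skip blank lines
            groups[len(s)].add(s)
    return {n: sorted(words) for n, words in sorted(groups.items())}


# Hypothetical sample; a real run would read one syllable per line from a list file.
sample = ["nghiêng", "khuynh", "trưởng", "Nguyễn", "kilôgam", "và"]
groups = group_by_length(sample)
for n, words in groups.items():
    print(n, words)
```

Without the normalization step, a list stored in decomposed form would inflate the counts, since every tone mark and vowel diacritic would be counted as an extra character.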
There is clear evidence that nghiêng is the only 7-letter native Vietnamese word. In the 7884-source, the seven-letter words are 'kilôgam', 'kilômet', 'nghiêng', 'nghiênh', 'nghuếch', 'đpctntư'. The first two are clearly loanwords, and the fourth and fifth are probably misspelt. The last is nonsense. In the all syllables list, the six seven-letter words are all tonal equivalents of nghiêng.
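For reference, the six tonal forms of nghiêng can be enumerated mechanically. This is a sketch rather than part of the original analysis: the helper `tonal_variants` is hypothetical, and it simply attaches each Vietnamese tone mark to the vowel ê as a Unicode combining character and then composes the result with NFC. It says nothing about whether a given form is an attested word.

```python
import unicodedata

# Combining marks for the five marked tones; the level tone (ngang) is unmarked.
TONE_MARKS = {
    "ngang": "",
    "sắc": "\u0301",    # combining acute accent
    "huyền": "\u0300",  # combining grave accent
    "hỏi": "\u0309",    # combining hook above
    "ngã": "\u0303",    # combining tilde
    "nặng": "\u0323",   # combining dot below
}


def tonal_variants(onset: str, vowel: str, coda: str) -> dict:
    """Return the syllable in all six tones, composed to NFC."""
    return {
        tone: unicodedata.normalize("NFC", onset + vowel + mark + coda)
        for tone, mark in TONE_MARKS.items()
    }


# "\u00ea" is ê; the tone mark is placed on it, as in nghiêng.
variants = tonal_variants("nghi", "\u00ea", "ng")
for tone, form in variants.items():
    print(tone, form, len(form))
```

Each generated form has the same letter count as the unmarked syllable, because after NFC composition the tone mark merges into the vowel instead of adding a character.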
Another seven-letter Vietnamese word
After browsing through various chữ Nôm dictionaries, I finally spotted a second example of a native Vietnamese word with seven letters. It is again a tonal equivalent of _nghiêng_, this time with the _ngã_ tone: _nghiễng_. It is sourced from _Tam Thiên Tự_, and _nghiễng_ even has its chữ Nôm counterpart: 覡 (meaning 'wizard'). I found this source on Facebook.
Conclusion
As of now, I have found one more word, along with its chữ Nôm counterpart, composed of seven letters. Forms such as nguyêng, nghiêch, and thuyêng seem plausible to me, but I don't see any evidence of their existence. Please comment if you believe that nghiêng and its tonal counterparts are the only seven-letter native Vietnamese words, or if there is evidence to the contrary.