vietnameselanguage / syllable

List of the current Vietnamese syllable
https://vietnameselanguage.org/

Is 'nghiêng' the only seven-letter Vietnamese word? #1

Open mulliganaceous opened 9 months ago

mulliganaceous commented 9 months ago

(Original source: Reddit)

After examining automatic language detectors and lists of longest words, I found that Vietnamese is a special case, mainly because the language is composed almost entirely of short words of no more than 6 letters. The occasional longer words are almost exclusively loanwords.

According to Wikipedia, the longest word (under this definition) is nghiêng, meaning 'inclined'. What strikes me is the wording of this statement, as it implies that nghiêng is the only native Vietnamese word with 7 letters, and that there are no native Vietnamese words with 8 or more letters. There are hundreds of different Vietnamese words with 6 letters, such as Nguyễn (much more common than Smith), trưởng (chief), and khuynh (inclined). Is it true that nghiêng and its tonal counterparts are the only seven-letter native Vietnamese words?

Research

Technically, Vietnamese separates strings of letters at a morpheme level, and each morpheme is a syllable in Vietnamese. To the uninitiated, it seems that every native Vietnamese word is of one syllable.

There is an online resource which lists all native Vietnamese words (technically, single-syllable morphemes) of the Vietnamese language. I ran a simple Python program that sorts and categorizes each Vietnamese word by length. I used three lists that are used in actual programs or research projects (7184-source, 7884-source, all syllables). Here are my results:

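| Length (letters) | 7184-source | 7884-source | All syllables |
|---|---|---|---|
| 1 | 48 | 74 | 60 |
| 2 | 855 | 1028 | 1216 |
| 3 | 2937 | 3172 | 5708 |
| 4 | 2372 | 2560 | 6872 |
| 5 | 832 | 887 | 3442 |
| 6 | 139 | 157 | 670 |
| 7 | 1 | 6 | 6 |
| 8+ | 0 | 0 | 0 |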
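For anyone who wants to reproduce this kind of count, here is a minimal sketch. It is not the original program. The function names and the sample list are made up for illustration, and it assumes that a "letter" means one Unicode code point after NFC normalization, so that a vowel with its diacritics counts once.

```python
import unicodedata
from collections import Counter


def letter_count(syllable: str) -> int:
    """Count letters, treating a vowel plus its diacritics as one letter.

    Assumption: NFC normalization composes every Vietnamese letter
    (base + vowel mark + tone mark) into a single code point.
    """
    return len(unicodedata.normalize("NFC", syllable.strip()))


def length_histogram(syllables):
    """Map each letter count to the number of non-empty syllables having it."""
    counts = Counter()
    for s in syllables:
        if s.strip():
            counts[letter_count(s)] += 1
    return dict(sorted(counts.items()))


def words_of_length(syllables, n):
    """Return the distinct syllables with exactly n letters, sorted."""
    return sorted(
        {unicodedata.normalize("NFC", s.strip()) for s in syllables
         if letter_count(s) == n}
    )


# Hypothetical sample; a real run would read one of the syllable lists instead.
sample = ["nghiêng", "Nguyễn", "trưởng", "khuynh", "kilôgam"]
print(length_histogram(sample))
print(words_of_length(sample, 7))
```

Normalizing first matters because a list saved in decomposed form (NFD) would otherwise make nghiêng look longer than 7 characters.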

There is clear evidence that nghiêng is the only 7-letter native Vietnamese word. In the 7884-source, the seven-letter words are 'kilôgam', 'kilômet', 'nghiêng', 'nghiênh', 'nghuếch', 'đpctntư'. The first two are clearly loanwords, the fourth and fifth are probably misspelt, and the last is nonsense. In the all syllables list, the six seven-letter words are all tonal equivalents of nghiêng.
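To make "tonal equivalents" concrete, here is a small sketch that builds the six tone forms of nghiêng by attaching a combining tone mark to the ê and composing the result. The function name, the ASCII tone labels, and the way the syllable is split into three parts are my own assumptions for illustration. It only generates spellings and says nothing about whether each form is an attested word.

```python
import unicodedata

# Combining marks for the six Vietnamese tones (ASCII labels for simplicity).
TONE_MARKS = {
    "ngang": "",        # level tone, no mark
    "huyen": "\u0300",  # combining grave accent
    "sac": "\u0301",    # combining acute accent
    "hoi": "\u0309",    # combining hook above
    "nga": "\u0303",    # combining tilde
    "nang": "\u0323",   # combining dot below
}


def tonal_variants(before: str, vowel: str, after: str) -> dict:
    """Attach each tone mark to `vowel` and compose the syllable with NFC."""
    return {
        tone: unicodedata.normalize("NFC", before + vowel + mark + after)
        for tone, mark in TONE_MARKS.items()
    }


variants = tonal_variants("nghi", "ê", "ng")
for tone, syllable in variants.items():
    print(tone, syllable, len(syllable))
```

Each of the six forms comes out at seven code points, which matches the six seven-letter entries in the all syllables column.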

Another seven-letter Vietnamese word

nghiễng

After browsing through various chữ Nôm dictionaries, I finally spotted a second example of a native Vietnamese word with seven letters. It is again a tonal equivalent of _nghiêng_, this time with the _ngã_ tone: _nghiễng_. It is sourced from _Tam Thiên Tự_, and _nghiễng_ even has a chữ Nôm counterpart: 覡 (meaning 'wizard'). I found this source on Facebook.

Conclusion

As of now, I have found one more seven-letter word, along with its chữ Nôm counterpart. I thought nguyêng, nghiêch, and thuyêng seemed plausible, but I have found no evidence that they exist. Please comment if you believe that nghiêng and its tonal counterparts are the only seven-letter native Vietnamese words, or if there is evidence to the contrary.

yeungon commented 9 months ago

Hi @mulliganaceous

To me, as a native Vietnamese speaker, "nghiễng" seems strange. I cannot find it in the online dictionary I currently maintain:

https://tudien.net/?word=nghi%E1%BB%85ng&dictionary=vi-vi

The "nghiễng" in the image above seems to be taken from "Tam Thiên Tự" (字學纂要), though I might be wrong, as I only remember a few words. "Nghiễng", I guess, is not Vietnamese.

Vietnamese has fewer than 7,000 syllables, so the data you mentioned are not correct.

I personally believe that "nghiêng" is the only syllable that has 7 letters.