ssb22 / CedPane

Chinese-English Dictionary Public-domain Additions for Names Etc (CedPane)
The Unlicense

Word overrides 到家 #15

chinese-words-separator closed this issue 2 years ago

chinese-words-separator commented 2 years ago

Found in 十分钟就到家了+Let's


The compound word 到家 is defined as

到家 到家 [dao4 jia1] /perfect/excellent/brought to the utmost degree/

But that definition does not fit the context of the sentence in the YouTube video above.

Adding 到家 to word segmentation overrides

By the way, if you need to collect word segmentation overrides while reading or watching with CWS, you can right-click the word(s) and choose an override, e.g.,


The user-defined segmentation overrides can be found in the CWS Options screen:


Effecting 到家 segmentation split override
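
To illustrate what a split override does, here is a minimal sketch using greedy longest-match segmentation. This is an assumption for illustration only: the thread does not say how CWS segments text internally, and the names `segment`, `LEXICON` and `split_overrides` are hypothetical.

```python
def segment(text, lexicon, split_overrides=frozenset(), max_len=4):
    """Greedy longest-match segmentation (illustrative, not CWS's actual code).

    Words listed in split_overrides are never emitted as a single unit,
    so the segmenter falls back to shorter matches for them.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            allowed = candidate in lexicon and candidate not in split_overrides
            if length == 1 or allowed:
                words.append(candidate)
                i += length
                break
    return words


# Hypothetical toy lexicon; a real one would come from the dictionary file.
LEXICON = {"十分钟", "十分", "分钟", "就", "到家", "到", "家", "了"}
sentence = "十分钟就到家了"

print(segment(sentence, LEXICON))
# ['十分钟', '就', '到家', '了']
print(segment(sentence, LEXICON, split_overrides={"到家"}))
# ['十分钟', '就', '到', '家', '了']
```

With the override in place, 到 and 家 are annotated separately, so the reader sees "arrive" and "home" rather than the compound's "perfect/excellent" sense.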


ssb22 commented 2 years ago

Thanks, that's a good context menu item. Incidentally the ABC Dictionary defines 到家 as "①get home ②reach a very high level; be perfect" and I can't help wondering if that definition 1 was put there like an 'override' (it's not included in the definition given by Pleco C-E or Xiandai Hanyu Guifan Cidian); unfortunately John DeFrancis passed away in 2009 so we can't now ask him what he was thinking with that ABC entry...

chinese-words-separator commented 2 years ago

If it's an override, it was pragmatic of him to do that. Since software dictionaries were not yet pervasive when he created his dictionary, the entry would help novice language learners be aware that there is another reading for 到家, which unsurprisingly is 'arriving home'.

I tried to restore the definition of 不想 made by another CC-CEDICT contributor, changing the entry from

不想 不想 [bu4 xiang3] /unexpectedly/

to

不想 不想 [bu4 xiang3] /not want/unexpectedly/

The editor rejected it; they have this policy:

One way of explaining our policy is that we, like other dictionaries, want to be concise, and avoid the redundancy inherent in including senses like "not want" for 不想 when we already have an entry that says 不 means "not" and another entry that says 想 means "want".

But CC-CEDICT still has some similarly redundant entries:

不行 不行 [bu4 xing2] /won't do/be out of the question/be no good/not work/not be capable/
不用 不用 [bu4 yong4] /need not/
不要 不要 [bu4 yao4] /don't!/must not/
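
For anyone who wants to check entries like these programmatically, here is a minimal sketch of a parser for the line format shown above (traditional, simplified, bracketed pinyin, then slash-delimited senses). The names `Entry` and `parse_entry` are hypothetical, and the regular expression only covers lines shaped like the examples in this thread.

```python
import re
from typing import List, NamedTuple


class Entry(NamedTuple):
    traditional: str
    simplified: str
    pinyin: str
    senses: List[str]


# traditional, simplified, [pinyin], then /sense/sense/.../
_LINE = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/\s*$")


def parse_entry(line: str) -> Entry:
    """Parse one dictionary line into its four fields."""
    match = _LINE.match(line)
    if match is None:
        raise ValueError(f"not a dictionary entry line: {line!r}")
    traditional, simplified, pinyin, body = match.groups()
    return Entry(traditional, simplified, pinyin, body.split("/"))


entry = parse_entry("不要 不要 [bu4 yao4] /don't!/must not/")
print(entry.simplified, entry.pinyin, entry.senses)
```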

Given that 不想 is a common expression, I don't want language learners to get confused by an annotation that says 我不想 is "I unexpectedly". I mostly agree with the policy, but since 不想 is more pervasive as a 'not want' expression than as an 'unexpectedly' expression, they should at least address the initial confusion that the 'unexpectedly' definition could cause language learners. I suggested that they introduce an indicator (e.g., a dangling slash at the end of the definition) showing that a compound word can also be read in a non-combined form, but I was told that it is out of scope.

AI is expensive, so the least we can do is make learners aware that a given compound word can be interpreted in two or more ways, e.g., by providing an override definition, or by splitting the word and then surfacing the garden paths.
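
As a rough sketch of the "split the word and surface both readings" idea, the snippet below lists the compound reading next to the character-by-character reading whenever every character has its own gloss. The function name `readings` and the toy `GLOSSES` table are hypothetical; the glosses themselves are the ones quoted in this thread.

```python
def readings(word, glosses):
    """Return the alternative readings of word as (segments, glosses) pairs.

    The first alternative treats the word as one compound; the second
    splits it into single characters, if all of them have glosses.
    """
    alternatives = []
    if word in glosses:
        alternatives.append(([word], [glosses[word]]))
    parts = list(word)
    if len(parts) > 1 and all(part in glosses for part in parts):
        alternatives.append((parts, [glosses[part] for part in parts]))
    return alternatives


# Toy gloss table (assumed for illustration).
GLOSSES = {
    "不想": ["unexpectedly"],
    "不": ["not"],
    "想": ["want"],
}

for segments, senses in readings("不想", GLOSSES):
    print(" + ".join(segments), "=>", senses)
# 不想 => [['unexpectedly']]
# 不 + 想 => [['not'], ['want']]
```

Showing both lines to the learner leaves the choice of reading to them, without any sentence-level disambiguation.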