tshatrov / ichiran

Linguistic tools for texts in Japanese language
MIT License
299 stars 33 forks

Support for んだ and んです suffix #32

Open Kimeiga opened 1 year ago

Kimeiga commented 1 year ago

Thanks so much for making this :)

I noticed that ichi.moe separates ん and だ, and ん and です, at the end of a sentence, when it often seems like they should stay together.

Consider: トムって毎朝ひげ剃ってるんだ https://ichi.moe/cl/qr/?q=%E3%83%88%E3%83%A0%E3%81%A3%E3%81%A6%E6%AF%8E%E6%9C%9D%E3%81%B2%E3%81%92%E5%89%83%E3%81%A3%E3%81%A6%E3%82%8B%E3%82%93%E3%81%A0&r=htr

JMDict has the following entry for んだ: のだ, んだ (exp) the expectation is that ...; the reason is that ...; the fact is that ...; the explanation is that ...; it is that ... (の and ん add emphasis)

んです: のです, んです (exp) the expectation is that ...; the reason is that ...; the fact is that ...; the explanation is that ...; it is that ... (の and ん add emphasis)

So just wanted to report it!

krackers commented 1 year ago

It depends on how you want to parse the grammar, I think. To me, のです should in fact be separated as の + です, since the の serves a distinct function as a nominalizer. The same goes for ん + です, which is the abbreviated form used in speech. At the end of a sentence it's usually only ever used with the copula, but this doesn't seem that different from nominalizing clauses to serve as topics or such.
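
To make the two readings concrete, here is a rough Python sketch of how a post-processing pass could support both. This is not ichiran's code or API (ichiran is written in Common Lisp). The function name `merge_explanatory`, the `merge` flag, and the token list for the example sentence are all made up for illustration, and the rule is deliberately naive.

```python
# Toy post-processing pass over an already-segmented sentence.
# Hypothetical illustration only; not how ichiran works internally.

NOMINALIZERS = {"の", "ん"}
COPULAS = {"だ", "です"}


def merge_explanatory(tokens, merge=True):
    """Optionally join an adjacent nominalizer + copula pair into one token.

    merge=True  gives the JMDict-style reading (んだ, んです as one expression).
    merge=False keeps the grammatical reading (ん + だ, ん + です).
    """
    if not merge:
        return list(tokens)
    out = []
    i = 0
    while i < len(tokens):
        is_pair = (
            i + 1 < len(tokens)
            and tokens[i] in NOMINALIZERS
            and tokens[i + 1] in COPULAS
        )
        if is_pair:
            # Glue the two surface forms together into a single token.
            out.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


# Assumed segmentation of the example sentence from the report.
tokens = ["トム", "って", "毎朝", "ひげ", "剃ってる", "ん", "だ"]
print(merge_explanatory(tokens))               # last token is "んだ"
print(merge_explanatory(tokens, merge=False))  # "ん" and "だ" stay separate
```

A rule this simple would also merge things like possessive の followed by だ, so a real implementation would need more context than just the two adjacent tokens.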