mifunetoshiro / kanjium

The ultimate kanji resource

fix accent for kouzuru #20

Closed tatsumoto-ren closed 2 months ago

tatsumoto-ren commented 2 months ago

Here again, an accent of 5 is impossible because it is larger than the word's total number of moras. Both 新明解 and 大辞林 give the correct accents as 0 and 3.
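The check described above can be sketched as follows. This is only an illustration, not code from Kanjium or AJT Japanese: the function names `count_moras` and `is_valid_accent` are hypothetical, the reading こうずる is inferred from the "kouzuru" in the issue title, and the mora count uses the simple rule that small kana (ゃ, ゅ, ょ and similar) merge with the preceding kana while っ, ん and ー each count as one mora.

```python
# Small kana that combine with the preceding kana into a single mora.
# (Assumption: this simple rule is enough for the readings being checked.)
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def count_moras(reading: str) -> int:
    """Count moras in a kana reading; っ, ん and ー each count as one."""
    return sum(1 for ch in reading if ch not in SMALL_KANA)


def is_valid_accent(reading: str, accent: int) -> bool:
    """An accent number must lie between 0 and the mora count, inclusive."""
    return 0 <= accent <= count_moras(reading)


reading = "こうずる"  # assumed kana reading for "kouzuru"
print(count_moras(reading))
print([a for a in (0, 3, 5) if is_valid_accent(reading, a)])
```

With a four-mora reading, 0 and 3 pass the check while 5 is rejected, which matches the correction proposed here.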

mifunetoshiro commented 2 months ago

By the way, this is a "raw" file that isn't really used in the main kanjium database; it was only used to parse out the accents and add them to the words in the kanjium database itself. It probably contains countless mistakes and typos, and if anything, it is the accents in the actual kanjium database that should be checked and corrected. Thank you anyway.

mifunetoshiro commented 2 months ago

After a quick check, some accents still have these marks appended, for example: (副) (名) (代) (形動) (副;名) (感) (名;形動) (副;感) (形動;副)

tatsumoto-ren commented 2 months ago

I see. Our Anki add-on, AJT Japanese, uses the pitch accent information from Kanjium's accents.txt file to generate its pitch accent database and pitch graphs. To parse Kanjium's data, we wrote a program that checks the accents and complains if it finds anything unexpected. I thought it would be easier if the file was updated upstream because that makes parsing more straightforward.

> (副) (名) (代) (形動) (副;名) (感) (名;形動) (副;感) (形動;副)

These marks are not used in AJT Japanese so our parser ignores them.
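A minimal sketch of how a parser might skip these marks is shown below. It is not the actual AJT Japanese parser: the function name `parse_accents` is hypothetical, and it assumes the accent field is a string of comma-separated numbers, each optionally prefixed with a parenthesized mark such as `(副)` or `(名;形動)`.

```python
import re

# Matches a parenthesized part-of-speech mark, e.g. "(副)" or "(名;形動)".
POS_MARK = re.compile(r"\([^)]*\)")


def parse_accents(field: str) -> list[int]:
    """Return the accent numbers in a field, ignoring part-of-speech marks."""
    # Remove the marks first so a ';' inside parentheses is never a separator.
    cleaned = POS_MARK.sub("", field)
    tokens = re.split(r"[,;\s]+", cleaned)
    return [int(tok) for tok in tokens if tok.isdecimal()]


print(parse_accents("(副)0,(名)1"))
print(parse_accents("(名;形動)2"))
print(parse_accents("0,3"))
```

Stripping the marks before splitting keeps the number extraction independent of which marks appear in the file.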

tatsumoto-ren commented 2 months ago

I have a couple more edits that I would like to share with Kanjium as well, if you're fine with that, of course.

mifunetoshiro commented 2 months ago

> I thought that it would be easier if the file was updated in the upstream because it makes parsing more straightforward.

This is fine, I'll merge any fixes. Thank you.