batterseapower / pinyin-toolkit

A plugin for the Anki Spaced Repetition System (http://ichi2.net/anki/)
http://batterseapower.github.com/pinyin-toolkit/

Add Cantonese Support (as per Damien's Request) #67

Open Nick3C opened 15 years ago

Nick3C commented 15 years ago

The next version of Anki will remove all language-specific features in favour of plugins. The Mandarin plugin does a much better job of generating readings. Likewise, I'm looking at moving the Japanese reading support from Kakasi to a better reading generator with a larger dictionary. These larger dictionaries result in better performance but they increase the size of the Anki download, and it's not really fair to users who don't study Chinese or Japanese. New users will be able to go to File>Download>Shared Plugin and search for 'japanese support', 'mandarin support', etc - and of course the documentation will need to be updated for this.

batterseapower commented 15 years ago

I did have a look at CJKlib in more detail a few days ago. The code generally looks to be of a high quality, and although it is new it's probably less buggy than anything we could write ourselves - and we can always contribute fixes upstream.

Still, integrating it will not be straightforward, and it doesn't e.g. do anything for us WRT tone sandhi support.

Nick3C commented 15 years ago

Well, you saw my email. I do think it is a good idea to do this, but I would prefer to do it eventually rather than jump into it at this stage. At the very best it could go into 0.06, but I have flagged it for consideration in 0.07 to give us a bit more breathing space, and we'll see how things go. Would you agree with that?

It sounds like CJKlib is probably the way to go; aside from Cantonese it gives us access to a whole range of other readings. It looks like it might handle our dictionary lookup from CEDICT and HanDeDict even more efficiently than we currently do. It seems to use an SQL database-backed dictionary to do this.
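To make the SQL-backed lookup idea concrete, here is a rough sketch using Python's built-in `sqlite3` module. This is not CJKlib's actual schema or API: the `entries` table, its columns, the sample rows, and the `build_demo_dictionary` / `lookup` helpers are all made up for illustration.

```python
import sqlite3


def build_demo_dictionary():
    """Create a tiny in-memory dictionary (hypothetical schema, sample rows only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (headword TEXT, reading TEXT, meaning TEXT)"
    )
    # An index on the headword lets lookups avoid scanning the whole table.
    conn.execute("CREATE INDEX idx_headword ON entries (headword)")
    conn.executemany(
        "INSERT INTO entries VALUES (?, ?, ?)",
        [
            ("你好", "ni3 hao3", "hello"),
            ("中文", "Zhong1 wen2", "Chinese language"),
        ],
    )
    return conn


def lookup(conn, headword):
    """Return all (reading, meaning) pairs stored for a headword."""
    cur = conn.execute(
        "SELECT reading, meaning FROM entries WHERE headword = ?",
        (headword,),
    )
    return cur.fetchall()
```

Something like `lookup(build_demo_dictionary(), "你好")` would return the rows for that word, and an unknown word gives an empty list. The appeal over parsing the dictionary files at startup is that the indexed query only touches the rows it needs.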

It would also handle stroke order more elegantly than your other plugin and enable us to break characters into components (and make cards of those), which would be very cool too. It may well (but I can't find a reference to it) allow us to remove tone marks to get standard pinyin (something we otherwise need to implement anyway).
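If we do end up implementing tone mark removal ourselves, one possible approach is Unicode decomposition from the standard library. The sketch below does not use CJKlib; `strip_tone_marks` and `TONE_MARKS` are hypothetical names, and it assumes "standard pinyin" here means pinyin with the tone diacritics dropped but ü kept.

```python
import unicodedata

# Combining marks for the four Mandarin tones: macron, acute, caron, grave.
# The diaeresis (U+0308) is deliberately absent so that ü survives.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tone_marks(pinyin):
    """Remove tone diacritics from accented pinyin, keeping the base letters."""
    # NFD splits each accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # NFC recombines u + diaeresis back into a single ü.
    return unicodedata.normalize("NFC", kept)
```

For example, `strip_tone_marks("nǐ hǎo")` gives `"ni hao"` and `strip_tone_marks("lǜ")` gives `"lü"`. This only handles accented pinyin; numbered pinyin such as `ni3 hao3` would need separate handling.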

Certainly these are far from trivial changes, so it's not going to do us any good to rush.