batterseapower / pinyin-toolkit

A plugin for the Anki Spaced Repetition System (http://ichi2.net/anki/)
http://batterseapower.github.com/pinyin-toolkit/

Automatically download dictionary updates #17

Open batterseapower opened 15 years ago

batterseapower commented 15 years ago

A bit tricky because there is no stable URL for e.g. CFDICT

Nick3C commented 15 years ago

CEDICT has one though (and since it does the pinyin lookups, it remains the most important one to keep up to date) :)

http://www.mdbg.net/chindict/export/cedict/cedict_1_0_ts_utf-8_mdbg.zip
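With a stable URL like this one, the download step could be sketched roughly as below. This is only an illustration in modern Python 3, not the plugin's code: the helper names `extract_first_member` and `download_cedict` are hypothetical, and it assumes the archive contains a single dictionary text file.

```python
import io
import urllib.request
import zipfile
from typing import Tuple

# Stable CEDICT export URL quoted in the comment above.
CEDICT_URL = "http://www.mdbg.net/chindict/export/cedict/cedict_1_0_ts_utf-8_mdbg.zip"


def extract_first_member(zip_bytes: bytes) -> Tuple[str, bytes]:
    """Return (name, contents) of the first file in an in-memory zip archive.

    Assumption: the archive holds exactly one dictionary file, so the first
    member is the one we want.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        name = archive.namelist()[0]
        return name, archive.read(name)


def download_cedict(url: str = CEDICT_URL) -> Tuple[str, bytes]:
    """Fetch the zip over HTTP and unpack it without touching the disk."""
    with urllib.request.urlopen(url) as response:
        return extract_first_member(response.read())
```

Keeping the unzip step separate from the network call means it can be checked against a locally built archive without hitting the server.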

Nick3C commented 15 years ago

Probably not too bad. We can just fetch the page at http://www.chinaboard.de/chinesisch_deutsch.php?mode=dl and use a regex to return the text matching: http://www.chinaboard.de/handedict/handedict-[8 digits].zip

Then we have a URL for it. CFDICT is similar, I think.
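The regex idea might look something like the sketch below. It assumes the download page contains the zip link as an absolute URL, and that the eight digits are a date stamp (so the largest value is the newest file); neither has been checked against the live page. The function name `find_handedict_url` is made up for illustration.

```python
import re
from typing import Optional

# Pattern from the description above: "handedict-" followed by eight digits.
HANDEDICT_URL = re.compile(
    r"http://www\.chinaboard\.de/handedict/handedict-(\d{8})\.zip"
)


def find_handedict_url(page_html: str) -> Optional[str]:
    """Return the matching zip URL with the largest 8-digit stamp, or None."""
    matches = HANDEDICT_URL.finditer(page_html)
    # Assumption: the digits are YYYYMMDD, so string comparison orders them by date.
    best = max(matches, key=lambda m: m.group(1), default=None)
    return best.group(0) if best else None
```

The page text itself could come from `urllib.request.urlopen(...).read()`, decoded with whatever encoding the site actually serves.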

Once we have the file we just:

cburgmer commented 15 years ago

You might want to have a look at class DictionaryDownloader from http://code.google.com/p/eclectus/source/browse/trunk/eclectusqt/update.py

You need to replace the KDE IO backend with pure Python equivalents, though.

Nick3C commented 15 years ago

ah, that's fantastic.

batterseapower commented 15 years ago

I've added a small script to do the download to the repo, in pinyin/dictionaries/downloader.py.

It still needs to:

So quite a lot still to do.

cburgmer commented 15 years ago

Sorry, I forgot about this topic. I actually extracted the base functions myself some weeks ago. I have put them into http://code.google.com/p/cjklib/source/browse/trunk/test/download.py.

Have you seen the table UpdateVersion that Eclectus writes? It might be helpful to make a separate library out of all that.