danielt998 / HanziToAnki

This is a program that takes a Chinese text as input and converts it to an Anki Deck
MIT License

Support other dictionaries #69

Open danielt998 opened 4 years ago

danielt998 commented 4 years ago

We should support other dictionaries, for example CedPane and CC-CANTO.

james-s-w-clark commented 4 years ago

How would the user choose? Would we supply multiple definitions (which may be very similar)? If CC-CEDICT has traditional characters too, what does CC-CANTO bring to the table?

james-s-w-clark commented 4 years ago

When talking about other dictionaries, it also makes me question if we should change our Word record from:

I was using this approach for GradedReaderBuilder to try to give general language support. If we adopted more general phrasing and changed the project name, we could generalise to any dictionary and any input text (if we have material with suitable licenses).

danielt998 commented 4 years ago

How would the user choose?

For something like CedPane, it would be in addition to CC-CEDICT, though there could be an option to enable/disable certain ones. For Cantonese, it would be a whole different thing: you'd want an option on the main screen for Cantonese/Mandarin (and maybe different romanisation systems too).

Would we supply multiple definitions (which may be very similar)?

I don't know, maybe so?

If CC-CEDICT has traditional characters too, what does CC-CANTO bring to the table?

Umm you know, CANTONESE PRONUNCIATION

james-s-w-clark commented 2 years ago

I think if users are interfacing with the backend through a web app, they could tick checkboxes for which pronunciations they'd like (Mandarin, Cantonese, etc.). The backend would just have all dictionaries (so long as the licenses are OK), and would add one or more pronunciation fields based on the user's selections on the web form.

I think we should do this later, but it would be fun to first enrich our Words and then enrich the flashcard output.
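
A rough sketch of how the checkbox selection could drive the pronunciation fields. Everything here is hypothetical: the `Pronunciation` enum, the `PronunciationFields` class and the hard-coded readings are stand-ins for illustration, not existing HanziToAnki code. In practice the readings would come from the loaded dictionaries (e.g. CC-CEDICT for pinyin, CC-CANTO for Jyutping).

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical: one value per checkbox on the web form.
enum Pronunciation { MANDARIN, CANTONESE }

final class PronunciationFields {
    // Toy data standing in for the real dictionaries.
    // "\u4f60\u597d" is the word "ni hao" written in Chinese characters.
    private static final Map<Pronunciation, Map<String, String>> READINGS =
            new EnumMap<>(Pronunciation.class);
    static {
        READINGS.put(Pronunciation.MANDARIN, Map.of("\u4f60\u597d", "ni3 hao3"));
        READINGS.put(Pronunciation.CANTONESE, Map.of("\u4f60\u597d", "nei5 hou2"));
    }

    // Returns one field per selected pronunciation, in enum declaration order.
    // A word with no known reading gets an empty field so the columns stay aligned.
    static List<String> fieldsFor(String word, Set<Pronunciation> selected) {
        List<String> fields = new ArrayList<>();
        for (Pronunciation p : Pronunciation.values()) {
            if (selected.contains(p)) {
                fields.add(READINGS.get(p).getOrDefault(word, ""));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        Set<Pronunciation> ticked = Set.of(Pronunciation.MANDARIN, Pronunciation.CANTONESE);
        System.out.println(fieldsFor("\u4f60\u597d", ticked));
    }
}
```

Iterating over the enum rather than the selected set keeps the field order stable regardless of how the form submits the checkboxes, which matters if the fields map onto fixed columns in the flashcard output.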

james-s-w-clark commented 2 years ago

Sounds like we may need a generic Extractor/Dictionary interface ;)
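
Something like the following might be a starting point for that interface. It is only a sketch under assumptions: `DictEntry`, `Dictionary`, `MapDictionary` and `CompositeDictionary` are made-up names, and the real Word record and dictionary parsers may be shaped quite differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical entry type; "source" records which dictionary a result came from.
record DictEntry(String headword, String pronunciation, String definition, String source) {}

// Hypothetical common interface that each dictionary would implement.
interface Dictionary {
    String name();
    List<DictEntry> lookup(String headword);
}

// In-memory stand-in for a parsed dictionary file.
final class MapDictionary implements Dictionary {
    private final String name;
    private final Map<String, List<DictEntry>> entries;

    MapDictionary(String name, Map<String, List<DictEntry>> entries) {
        this.name = name;
        this.entries = entries;
    }

    @Override public String name() { return name; }

    @Override public List<DictEntry> lookup(String headword) {
        return entries.getOrDefault(headword, List.of());
    }
}

// Queries every enabled dictionary and concatenates the results in list order.
final class CompositeDictionary implements Dictionary {
    private final List<Dictionary> enabled;

    CompositeDictionary(List<Dictionary> enabled) {
        this.enabled = List.copyOf(enabled);
    }

    @Override public String name() { return "composite"; }

    @Override public List<DictEntry> lookup(String headword) {
        List<DictEntry> results = new ArrayList<>();
        for (Dictionary d : enabled) {
            results.addAll(d.lookup(headword));
        }
        return results;
    }
}
```

With this shape, enabling or disabling a dictionary is just a matter of which instances are passed to `CompositeDictionary`, and the `source` field lets the caller decide whether to show near-duplicate definitions from several dictionaries or keep only the first.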