rdoeffinger / Dictionary

"QuickDic" offline Dictionary App for Android. Provided downloadable dictionaries are based on Wiktionaries but can also be created from other sources (see DictionaryPC). Remember to use --recursive when cloning! Fork of project that used to be hosted at code.google.com/p/quickdic-dictionary.
Apache License 2.0

Feature Request: Combine dictionaries #31

Open Heineken opened 8 years ago

Heineken commented 8 years ago

I have created dictionaries from dict.cc data for English, Russian, French and Spanish as they offer some words and some grammatical information which cannot be found in the standard dictionaries.

Also, they don't have the annoying "linked entries" where you have to click on a blue link and scroll a while to find the translation.

Looking up a rare word can take a while, because I have to switch between dictionaries, without knowing which one is active, before I try Leo, Google Translate or whatever.

I propose to implement an option for merging dictionaries / combined lookup. That could be automatic for all dictionaries of the same language and would thus require only one additional checkbox.

rdoeffinger commented 8 years ago

Leaving aside that I suspect this would not be easy to implement (a scrollable list needs a global index, and combining several indices into a single one at runtime sounds expensive), I'm not really sure how you think the result of that combined lookup should be displayed. Just dumping all results would probably look a bit messy. Using multiple columns would not fit on most screens. Indicating the language for each translation in-line might waste a good bit of space.
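To make the "global index" concern concrete, here is a minimal Java sketch of what combining two sorted indices involves. It is not code from QuickDic or DictionaryPC: the class `CombinedIndexSketch`, the method `mergeSorted`, and the use of `String.CASE_INSENSITIVE_ORDER` as the sort order are all assumptions made for illustration. The app's real index format and collation rules are not described in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; not part of QuickDic or DictionaryPC. */
final class CombinedIndexSketch {

    /**
     * Merges two already-sorted key lists into one sorted list.
     * On ties the key from {@code a} comes first, so the merge is stable.
     */
    static List<String> mergeSorted(List<String> a, List<String> b, Comparator<String> order) {
        List<String> merged = new ArrayList<>(a.size() + b.size());
        int i = 0;
        int j = 0;
        // Walk both lists in parallel, always taking the smaller head element.
        while (i < a.size() && j < b.size()) {
            if (order.compare(a.get(i), b.get(j)) <= 0) {
                merged.add(a.get(i++));
            } else {
                merged.add(b.get(j++));
            }
        }
        // Append whatever is left of the list that was not exhausted.
        while (i < a.size()) {
            merged.add(a.get(i++));
        }
        while (j < b.size()) {
            merged.add(b.get(j++));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Assumed sample keys standing in for a stock and a custom dictionary index.
        List<String> stock = Arrays.asList("apple", "house", "tree");
        List<String> custom = Arrays.asList("Apple", "garden", "house");
        System.out.println(mergeSorted(stock, custom, String.CASE_INSENSITIVE_ORDER));
    }
}
```

The merge touches every key of both indices once, so its cost grows with the combined index size. That is why doing it on every dictionary switch at runtime looks expensive, while doing it once at build time does not.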

Heineken commented 8 years ago

Well, I think the lookup should look like the lookup in a single dictionary. Identifying the origin of each entry is optional.

If there is no simple way of looking up in two lists at once and it is too expensive at runtime, how about merging the dictionaries once into a new file? Maybe in a first step as a command line tool. Maybe that DictionaryBuilder.jar can be modified.

rdoeffinger commented 8 years ago

How do you create the dictionaries? If you use the DictionaryBuilder you should simply specify the sources for one as --input0 and the other as --input1 and you get both in a single dictionary.

Heineken commented 8 years ago

I have one text file with both languages and pipe that through the jar, along with an ignore list.

I can't specify an existing dictionary as a source, can I?

I need text files as input, right?

I don't have the input files for the "stock" dictionaries, do I?

rdoeffinger commented 8 years ago

Using dictionaries as a source is not yet possible, no. I guess it would be kind of nice, but I doubt I'll have time for that. The process for generating the stock dictionaries is in the readme.txt (this repository, not the DictionaryPC, don't ask me why, I probably wasn't thinking). All data is public, it's quite a large download from Wiktionary though (and sometimes they delete old database versions, then you need to use newer ones). It would probably be easiest to modify generate_dictionaries.sh to include your additional sources once you get to that step. The first 10 or so lines of that script can be used to generate only some dictionaries.

Heineken commented 8 years ago

OK, thanks, I'll look into that.

Heineken commented 8 years ago

Hmm, I use a single jar named "DictionaryBuilder.jar" to compile my dictionaries. It works fine on Windows (except for ? instead of Cyrillic letters in the success message).

Now I looked into DictionaryPC and can't find any jar of that name. The command in the shell scripts also looks different from what I use: I call my DictionaryBuilder.jar with the same arguments the readme.txt (last line) gives for run.sh.

The stuff in DictionaryPC isn't meant to be run "natively" in Windows as I do it, or is it? If so, where do I find the current version of DictionaryBuilder.jar?

rdoeffinger commented 8 years ago

DictionaryBuilder.jar is just a compiled and bundled up version of DictionaryPC. The scripts should work fine on Windows as long as you have a POSIX shell and a JDK available (usually that would mean installing Cygwin, I guess). The run.sh could also just call a DictionaryBuilder.jar. Or you can just figure out the command lines from the scripts. But if you use an old DictionaryBuilder.jar you will have some issues: mostly some small bugs, especially with French, but also no support for the more efficient v7 dictionary format.

Heineken commented 8 years ago

The 1st sentence doesn't make sense.

Wouldn't it make sense to provide such a bundled jar to lower the threshold for non-Linux users who don't want to install additional software for a small script? I guess more people have Java installed than Cygwin or Linux.

rdoeffinger commented 8 years ago

Fixed that first sentence.

Even with the jar, to use the scripts you'd still need Cygwin or Linux or such. So a jar currently only really helps users who can figure out how to use it from scratch with no real documentation, and who can't or do not want to use Cygwin or Linux, compile manually, or use an IDE like Eclipse to compile and generate a jar themselves. So yes, a release of DictionaryPC with a compiled version would be nice, but it is very, very far down the list in priority.

Heineken commented 8 years ago

OK, I see. I just want to make sure you got my use case:

I don't see the necessity to build the stock dictionaries on my own, as they can be downloaded conveniently from within the app. I do, however, want to create alternative dictionaries from other sources. For this I don't need any scripts, only the command to run the jar and examples for the input file formats. I don't have to be a developer to do this, just a command line user.

The question is: if someone were to add a compiled version, could the compilation process be automated to include later updates?