Dushistov / sdcv

https://dushistov.github.io/sdcv/
GNU General Public License v2.0

Does not display all of the dictionary's results #30

Closed InterAl closed 2 years ago

InterAl commented 6 years ago

Seems like the program does not return all values from the dictionary.

For example, there are multiple values for "rock" (noun, verb, etc) in the Webster's 1913 dictionary, but only the first result is displayed.

I tried with 2 other dictionaries as well, and the problem persisted.

Dushistov commented 6 years ago

Yes, that is how it works now: it just picks the first article/translation and returns it. Of course, it would be better to return all articles/translations from the same dictionary.

InterAl commented 6 years ago

I think it's very important, as I don't get the intended result (the definition of either a noun or a verb) quite frequently.

Frenzie commented 6 years ago

@InterAl A temporary workaround was posted here: https://github.com/koreader/koreader/issues/2951#issuecomment-312947486

Dushistov commented 6 years ago

But what is the use case for such dictionaries? Why not just merge all translations that share a key into one entry at the dictionary-creation stage?

InterAl commented 6 years ago

If the dictionaries merged them beforehand, they would need to format the merged string themselves. When the definitions are separate, each client can implement its own formatting.

TnS-hun commented 6 years ago

Another real-life use case is when two words only differ by case in a dictionary.

Example from Wiktionary:

car: A wheeled vehicle that moves independently, with at least three wheels, powered mechanically, steered by a driver and mostly for personal transportation; a motorcar or automobile.

CAR: Initialism of Central African Republic.

I think returning such words like this would be perfect:

[{"dict": "Wiktionary","word":"car","definition":"..."},
{"dict": "Wiktionary","word":"CAR","definition":"..."}]

con-jo-ry commented 3 years ago

I use a number of specialised dictionaries in StarDict format that all tend to have multiple entries with the same headword. It would be great if this functionality could be added to sdcv; I would very much like to use it over GoldenDict.

daniesteban commented 3 years ago

I'm really interested too. I use translation dictionaries in KOReader and only get the first definition 👎

cyphar commented 2 years ago

I started taking a look at this. While it looks trivial at first glance, a lot of the other code depends on the special return values from ::lookup() when the word isn't found, so making it append each index to an array is harder than it first appears. You could implement support for it in the caller of idx_file->lookup, but because syn_file->lookup returns the dict index rather than the syn_file index, it needs to be implemented in at least one of the lookup methods. Maybe if we took glong &next_idx as well as std::vector<glong> &idxs then it wouldn't be too bad to implement.

Basically, the issue is that StarDict works by having an index that you binary search to find the entry offset in the main dictionary, but sdcv doesn't check whether there are other lexically identical strings next to the one the search finds. This means the result you get is actually arbitrary: adding an entry somewhere else in the dictionary would change which entry you get for an unrelated word lookup. The solution is conceptually simple: at the end of the binary search, walk back in the index until you hit a word that doesn't match the search string, then walk forwards and return an array of the indexes you found (the same applies to the synonym index and every other kind of index). But because the lookup function is also used as a way of getting the "next" result (which includes some magic values like 0 and INDEX_INVALID), this becomes a bit more complicated.
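
To illustrate the "walk back, then walk forwards" idea, here is a minimal sketch. It assumes the index is just a sorted std::vector<std::string> of headwords compared with plain std::string ordering. The name lookup_all is hypothetical, and none of this is sdcv's actual index code, which has its own index classes, comparison function, and the magic return values mentioned above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, NOT part of sdcv: returns the positions of every entry
// in a sorted headword list that is identical to `word`.
// Assumption: `index` is sorted with plain std::string ordering.
std::vector<std::size_t> lookup_all(const std::vector<std::string> &index,
                                    const std::string &word)
{
    std::vector<std::size_t> hits;
    std::size_t lo = 0, hi = index.size();

    // Ordinary binary search: it may land on any one of several equal entries.
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        const int cmp = index[mid].compare(word);
        if (cmp < 0) {
            lo = mid + 1;
        } else if (cmp > 0) {
            hi = mid;
        } else {
            // Walk back to the first identical headword...
            std::size_t first = mid;
            while (first > 0 && index[first - 1] == word)
                --first;
            // ...then walk forwards, collecting every identical headword.
            for (std::size_t i = first; i < index.size() && index[i] == word; ++i)
                hits.push_back(i);
            break;
        }
    }
    return hits; // empty when the word is not in the index
}
```

Returning an empty vector for "not found" sidesteps the magic values in this toy version only. It does not address the "next result" behaviour that the real lookup methods also have to provide.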

cyphar commented 2 years ago

#78 has been submitted, which should fix this issue.