Feature request: Part of speech separation

BoboTiG / ebook-reader-dict

Finally decent dictionaries based on Wiktionary for your beloved eBook reader.

http://www.tiger-222.fr/?d=2020/04/17/22/14/21-un-dictionnaire-alternatif-et-complet-pour-votre-liseuse

MIT License

Feature request: Part of speech separation #1149

Open chopinesque opened 2 years ago

chopinesque commented 2 years ago

Currently, all definitions for each entry are in one list regardless of part of speech (verb, noun, etc). It would be good if at least there was an option to have them grouped by part of speech for easier accessing of information.

lasconic commented 2 years ago

A bit like https://github.com/BoboTiG/ebook-reader-dict/issues/1104, it would be possible to do that if we could store several (word + definition) pairs for a given word. Right now, Words is a dict[str, Word], and a Word holds a list of etymologies, a single part of speech, pronunciation, and a list of definitions. Somehow we would need to have Words = dict[str, list[Word]]... Also, we would need to match each etymology to the right part of speech...
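As a rough illustration of the change described above, here is a minimal sketch. The `Word` fields, the `add_entry` helper, and the sample definitions are assumptions made for the example, not the project's actual classes or data.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    # Hypothetical fields, loosely following the description above.
    part_of_speech: str
    pronunciations: list[str] = field(default_factory=list)
    etymologies: list[str] = field(default_factory=list)
    definitions: list[str] = field(default_factory=list)


# Current shape: a single Word per headword.
WordsToday = dict[str, Word]
# Proposed shape: several Word entries per headword, one per part of speech.
Words = dict[str, list[Word]]


def add_entry(words: Words, headword: str, entry: Word) -> None:
    """Append an entry without overwriting those already stored for the headword."""
    words.setdefault(headword, []).append(entry)


# Made-up sample data for the headword "test".
words: Words = {}
add_entry(words, "test", Word("noun", definitions=["an examination"]))
add_entry(words, "test", Word("verb", definitions=["to examine"]))
```

With this shape, each entry keeps its own etymologies, which is one possible way to handle the etymology/part-of-speech matching mentioned above.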

chopinesque commented 2 years ago

I think that etymology should/would appear at the top no matter what. Mind you, I think etymology too could benefit from an option not to include it in the final file, as most people are not really that interested in it.

Of course, you are the expert on the coding aspect and the feasibility thereof, but as a daily professional dictionary user I can see great benefits in part of speech (POS) sense grouping. At the end of the day, they are already grouped as such in Wiktionaries.

Another option would be to have a prefix in each sense with the POS (so as to avoid using different headers per POS) and have them sorted on that prefix. Not sure if that makes any sense or whether it would be easier programmatically.
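To make the prefix idea a bit more concrete, here is a rough sketch. The `prefix_and_sort` function, the `(pos, text)` tuple layout, and the sample senses are all hypothetical; it only shows that a stable sort on the POS keeps the original sense order within each part of speech.

```python
def prefix_and_sort(senses: list[tuple[str, str]]) -> list[str]:
    """Prefix each sense with its POS and sort on that prefix.

    sorted() is stable, so senses sharing a POS keep their original order.
    """
    ordered = sorted(senses, key=lambda sense: sense[0])
    return [f"({pos}) {text}" for pos, text in ordered]


# Made-up senses, in the mixed order a single flat list might have.
senses = [
    ("verb", "to examine"),
    ("noun", "an examination"),
    ("verb", "to take an exam"),
]
lines = prefix_and_sort(senses)
```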

BoboTiG commented 2 years ago

There is definitely room for improvement. Separating POS would be a good thing and would be more aligned with the Wiktionary output. @lasconic's idea seems interesting; it would help with the current issue, and with #1104 indeed. Then it would be a matter of adapting the HTML template (maybe switching to Jinja2 at the same time?).

Moonbase59 commented 2 years ago

I absolutely agree with @chopinesque.

Fortunately, I have a few dictionaries installed (using GoldenDict on Linux), so I'm attaching an HTML output showing how others do the part-of-speech separation for the word "test".

Of course I prefer mine ("Wiktionary (De-De)"), hee hee, and ours (this project’s) is under "Wiktionary EN-EN".

test.html.zip

victornove commented 2 years ago

Hi, my two cents on the topic. In my first attempt, I added the part of speech to each definition of the word. It seemed to be the simplest and most flexible way to do it. For the languages I checked, the POS was usually in the section title, but for German and Russian I just injected it into the section title from previous/higher sections in the hierarchy. This only solves half of your problem though, since you need to render it in a dictionary afterwards.
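A minimal sketch of that inheritance step, under stated assumptions: sections arrive as `(level, title, definitions)` tuples in document order, and `KNOWN_POS` is a small illustrative set. Neither the names nor the data come from an actual implementation.

```python
from typing import Optional

# Illustrative subset; a real list would be per-language.
KNOWN_POS = {"noun", "verb", "adjective", "adverb"}

Section = tuple[int, str, list[str]]  # (heading level, title, definitions)


def tag_definitions(sections: list[Section]) -> list[tuple[Optional[str], str]]:
    """Attach to each definition the POS of the nearest enclosing section."""
    ancestors: list[tuple[int, str]] = []
    tagged: list[tuple[Optional[str], str]] = []
    for level, title, definitions in sections:
        # Leave any section that is at the same depth or deeper.
        while ancestors and ancestors[-1][0] >= level:
            ancestors.pop()
        ancestors.append((level, title))
        # Look from the current section upwards for a POS title.
        pos = next(
            (t.lower() for _, t in reversed(ancestors) if t.lower() in KNOWN_POS),
            None,
        )
        tagged.extend((pos, definition) for definition in definitions)
    return tagged


# Made-up layout where the POS sits one level above the definitions.
sections = [
    (2, "English", []),
    (3, "Noun", []),
    (4, "Meanings", ["an examination"]),
    (3, "Verb", []),
    (4, "Meanings", ["to examine"]),
]
tagged = tag_definitions(sections)
```

Definitions found outside any POS section get `None`, which leaves the rendering side free to decide how to show them.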

MolotovCherry commented 1 year ago

I want to say, this project is awesome. I absolutely love it. But regarding what the other person said about etymology, I have to say I also agree. It is getting in the way of my reading, and I would just like to see definitions with no etymology or pronunciation. As a result I was forced to use a different project which shows only definitions, but I'd like to move to this project in the future.

BoboTiG commented 1 year ago

Then, let's generate additional dictionaries without etymology. I'll try something to see how it fits.

BoboTiG commented 1 year ago

You can try etymology-free dictionaries right now ;)

MolotovCherry commented 1 year ago

You can try etymology-free dictionaries right now ;)

Really appreciate it! Thanks a lot!