jeffcarp / chinesedict

A fast web frontend to CC-EDICT
https://chinesedict2.web.app/

Example phrases containing a word #33

Open brainwo opened 1 week ago

brainwo commented 1 week ago

I would like to see this added, as on Jisho:

image

jeffcarp commented 1 week ago

Do you mean also organizing the examples by which usage they refer to? I know Pleco has this. CEDICT doesn't have this level of information, nor does any free dictionary dataset I know of.

brainwo commented 1 week ago

It looks like HanziGraph generates example sentences using an LLM. I'm not sure about the quality though.

  • OpenAI's gpt-3.5-turbo model generated example sentences for ~80,000 words and characters.

https://github.com/mreichhoff/HanziGraph/blob/53d2a949a37e729a34d32fa227b73cbbf0024612/README.md?plain=1#L122

jeffcarp commented 1 week ago

Currently the site loads the entire dictionary on the frontend since I wanted it to be as snappy as possible and ideally avoid a backend server. Adding examples would make the dictionary unsustainably large[^1], so a refactor is required. You can see more of my thinking in #15.

I started making a Go version of this app in order to accomplish this (you can try it here: https://zhongwenfyi.appspot.com/), however I'm still not fully satisfied with the UX. Moving forward, it might make sense to combine the full-dictionary frontend search from the static site version (https://chinesedict2.web.app/) into the Go version.

[^1]: In the Go version, the uncompressed dictionary size is 47MB.
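To make the size concern concrete, here is a minimal sketch (in Python for brevity, not the app's actual code) of one way such a refactor could split the data: a compact core payload that the frontend loads up front for search, and a separate example store keyed by headword that could be fetched on demand. The field names (`simplified`, `pinyin`, `definitions`, `examples`) and the sample entries are assumptions for illustration, not the project's schema.

```python
import json

# Hypothetical entries; the field names are assumed, not the project's schema.
ENTRIES = [
    {
        "simplified": "小狗",
        "pinyin": "xiao3 gou3",
        "definitions": ["puppy"],
        "examples": [["小狗在睡觉。", "The puppy is sleeping."]],
    },
    {
        "simplified": "猫",
        "pinyin": "mao1",
        "definitions": ["cat"],
        "examples": [],
    },
]


def split_payload(entries):
    """Separate the searchable core from the (much larger) example sentences."""
    # Core: everything except the examples, shipped to the frontend up front.
    core = [{k: v for k, v in e.items() if k != "examples"} for e in entries]
    # Examples: keyed by headword so they can be served one word at a time.
    examples = {}
    for e in entries:
        if e.get("examples"):
            examples.setdefault(e["simplified"], []).extend(e["examples"])
    return core, examples


def payload_bytes(obj):
    """Size of the UTF-8 encoded JSON serialization, in bytes."""
    return len(json.dumps(obj, ensure_ascii=False).encode("utf-8"))


core, examples = split_payload(ENTRIES)
print(payload_bytes(ENTRIES), payload_bytes(core))
```

With this split, search stays entirely on the client, and only the examples for the word being viewed would need a request to a server or a static file.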

jeffcarp commented 1 week ago

The Go version has examples pulled from Tatoeba, if you want to try it out (there's no formatting or anything yet): https://zhongwenfyi.appspot.com/word/%E5%B0%8F%E7%8B%97

Getting examples specifically for each usage of a word (your original request) sounds like a non-trivial amount of work, but it would be very useful.
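For comparison, attaching examples to a word without regard to usage can be as simple as a substring match over sentence pairs. The sketch below (Python, for illustration only) shows that baseline; the sample sentences are invented here rather than taken from Tatoeba, and `build_example_index` is a hypothetical helper, not code from either version of the app. Note that it cannot tell which sense of a word a sentence illustrates, which is exactly the part that would need extra data.

```python
# Invented sample data: (Chinese sentence, English translation) pairs.
SENTENCES = [
    ("我有一只小狗。", "I have a puppy."),
    ("小狗在睡觉。", "The puppy is sleeping."),
    ("我喜欢猫。", "I like cats."),
]


def build_example_index(words, sentences, max_per_word=3):
    """Map each headword to up to max_per_word sentences that contain it."""
    index = {}
    for word in words:
        # Plain substring match: no word segmentation, no sense disambiguation.
        matches = [(zh, en) for zh, en in sentences if word in zh]
        if matches:
            index[word] = matches[:max_per_word]
    return index


index = build_example_index(["小狗", "猫", "老虎"], SENTENCES)
for zh, en in index["小狗"]:
    print(zh, en)
```

Words with no matching sentence (老虎 here) are simply left out of the index, so the page for such a word would show no examples.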

jeffcarp commented 3 days ago

> Moving forward, it might make sense to combine the full-dictionary frontend search from the static site version (https://chinesedict2.web.app/) into the Go version.

Ok I did this in a very hacky way to try it out: https://zhongwenfyi.appspot.com/

> Getting examples specifically for each usage of a word (your original request) sounds like a non-trivial amount of work, but it would be very useful.

@brainwo would you have any interest in helping me aggregate all this data?