Open brainwo opened 1 week ago
Do you mean also organizing the examples by which usage they refer to? I know Pleco has this. CEDICT doesn't have this level of information, nor does any free dictionary dataset I know of.
It looks like HanziGraph generates example sentences using an LLM. I'm not sure about the quality, though.
- OpenAI's `gpt-3.5-turbo` model generated example sentences for ~80,000 words and characters.
Currently the site loads the entire dictionary on the frontend since I wanted it to be as snappy as possible and ideally avoid a backend server. Adding examples would make the dictionary unsustainably large[^1], so a refactor is required. You can see more of my thinking in #15.
I started making a Go version of this app in order to accomplish this (you can try it here: https://zhongwenfyi.appspot.com/), however I'm still not fully satisfied with the UX. Moving forward, it might make sense to combine the full-dictionary frontend search from the static site version (https://chinesedict2.web.app/) into the Go version.
[^1]: In the Go version, the uncompressed dictionary size is 47MB.
The Go version has examples pulled from Tatoeba, if you want to try it out (there's no formatting or anything yet): https://zhongwenfyi.appspot.com/word/%E5%B0%8F%E7%8B%97
Getting examples specifically for each usage of a word (your original request) sounds like a non-trivial amount of work, but it would be very useful.
> Moving forward, it might make sense to combine the full-dictionary frontend search from the static site version (https://chinesedict2.web.app/) into the Go version.
Ok I did this in a very hacky way to try it out: https://zhongwenfyi.appspot.com/
> Getting examples specifically for each usage of a word (your original request) sounds like a non-trivial amount of work, but it would be very useful.
@brainwo would you have any interest in helping me aggregate all this data?
I would like to see this added, just like in Jisho: