cschiller / zhongwen

Official source code of the "Zhongwen" Chrome extension
https://chrome.google.com/webstore/detail/zhongwen-chinese-english/kkmlkkjojmombglmlpbpapmhcaljjkde
GNU General Public License v2.0
322 stars, 54 forks

Adding custom dictionaries #31

Closed: johnbeard closed this issue 4 years ago

johnbeard commented 4 years ago

I often find words I want translated later, but which aren't in CC-CEDICT. Primarily technical words, but also various neologisms and place names.

It would be nice to be able to collect these in a separate dictionary file, so that they can later be submitted to CC-CEDICT (or if not suitable for CC-CEDICT, kept locally).

cschiller commented 4 years ago

Chrome extensions cannot - and for security reasons should not - read your local files. So I'm not sure how this is supposed to work. I would recommend submitting these entries to CC-CEDICT. They will then be picked up as part of a subsequent Zhongwen release.

johnbeard commented 4 years ago

CC-CEDICT doesn't always accept entries that I'd like to have available. Even when they do, it's very slow - 90% of my recent submissions are over 10 days old. By the time they get into CC-CEDICT and then by the time Zhongwen inhales the new file, whatever I was trying to use it for happened weeks ago.

You could, as a simple implementation, add them through a UI somewhat like the Word List, the difference being that you can enter your own definitions. Then allow import/export from that page.
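The import/export idea could be sketched roughly as follows. This is not Zhongwen code: `CustomEntry`, `export_entries` and `import_entries` are hypothetical names, and Python is used only to keep the illustration short. The one grounded detail is the CC-CEDICT line layout (`Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/`), which is the form a later batch submission would need.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CustomEntry:
    """Hypothetical user-defined entry; the field names are illustrative."""
    traditional: str
    simplified: str
    pinyin: str              # numbered-tone pinyin, e.g. "qu1 kuai4 lian4"
    definitions: list[str]

    def to_cedict_line(self) -> str:
        # CC-CEDICT layout: Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/
        glosses = "/".join(self.definitions)
        return f"{self.traditional} {self.simplified} [{self.pinyin}] /{glosses}/"


def export_entries(entries: list[CustomEntry]) -> str:
    """Serialize the user's entries to JSON (what an export button might save)."""
    return json.dumps([asdict(e) for e in entries], ensure_ascii=False, indent=2)


def import_entries(text: str) -> list[CustomEntry]:
    """Rebuild entries from previously exported JSON."""
    return [CustomEntry(**item) for item in json.loads(text)]
```

Keeping the entries as structured records rather than raw text means the same list can be exported for backup or turned into CC-CEDICT lines for submission.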

Technically speaking, you can access local files with a native client, but that's rather clunky, and of questionable security.

cschiller commented 4 years ago

> It would be nice to be able to collect these in a separate dictionary file, so that they can later be submitted to CC-CEDICT (or if not suitable for CC-CEDICT, kept locally).

I would suggest you use something like Google Docs for keeping a list of those words. You can then submit a batch of your entries and once they have been reviewed they will eventually show up in Zhongwen. I realize that this is not instantaneous and there will always be words, like place names, that will never make it into CC-CEDICT.

For security reasons I'm not willing to add support for custom dictionaries. All these entries would have to be sanitized to avoid any of the well-known security problems. Also you would have to be very careful to use the correct format. I already clean up the built-in CC-CEDICT dictionary because it contains a number of entries that Zhongwen cannot handle. I'm sorry, it's a tough decision, but in this case it's a feature I'd rather not include.
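To make the two concerns above concrete (correct format and sanitization), here is a minimal sketch of what checking a user-supplied entry might involve. It is an assumption-laden illustration, not how Zhongwen cleans CC-CEDICT: `parse_entry` and `escape_for_display` are hypothetical helpers, the pinyin rule is deliberately stricter than the real dictionary, and HTML-escaping is only one of the precautions an extension would need.

```python
import html
import re
from typing import Optional

# CC-CEDICT layout: Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/
ENTRY_RE = re.compile(r"^(\S+) (\S+) \[([^\]]+)\] /(.+)/$")
# Assumed rule: a syllable is letters (or "u:" for ü) plus a tone digit 1-5.
SYLLABLE_RE = re.compile(r"^(?:[A-Za-z]|u:)+[1-5]$")


def parse_entry(line: str) -> Optional[dict]:
    """Return the parsed fields, or None when the line is malformed."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    traditional, simplified, pinyin, glosses = match.groups()
    if not all(SYLLABLE_RE.match(s) for s in pinyin.split()):
        return None
    definitions = glosses.split("/")
    if any(not d.strip() for d in definitions):
        return None  # reject empty glosses such as "/a//b/"
    return {
        "traditional": traditional,
        "simplified": simplified,
        "pinyin": pinyin,
        "definitions": definitions,
    }


def escape_for_display(entry: dict) -> dict:
    """HTML-escape every field so markup in an entry is shown as text."""
    return {
        "traditional": html.escape(entry["traditional"]),
        "simplified": html.escape(entry["simplified"]),
        "pinyin": html.escape(entry["pinyin"]),
        "definitions": [html.escape(d) for d in entry["definitions"]],
    }
```

Rejecting malformed lines outright, instead of trying to repair them, keeps the validation small and easy to audit.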

johnbeard commented 4 years ago

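I understand not allowing inhaling files directly off disk, but you could still have a UI that allows you to enter custom definitions in some kind of grid, a bit like the word list UI. It could basically be the word list with the following differences:

  • You can set each of the Chinese, pinyin and definition fields, not just notes
  • The entries do not have to be in CC-CEDICT (in fact, that's the point)
  • Items in the row are appended to the dictionary search database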

danielt998 commented 4 years ago

Maybe this is the wrong issue for this, but have you considered adding other free/open source dictionaries such as CedPane, CFDict and CC-CANTO?

cschiller commented 4 years ago

There is a French version of Zhongwen that uses CFDICT. You can find it in the Chrome Web Store.

atse commented 4 years ago

> I understand not allowing inhaling files directly off disk, but you could still have a UI that allows you to enter custom definitions in some kind of grid, a bit like the word list UI. It could basically be the word list with the following differences:
>
>   • You can set each of the Chinese, pinyin and definition fields, not just notes
>   • The entries do not have to be in CC-CEDICT (in fact, that's the point)
>   • Items in the row are appended to the dictionary search database

Yomichan is a good example of an extension where custom (Japanese) dictionaries can be imported into Chrome's persistent storage.
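As a rough illustration of "appending" user entries to the search database without touching the built-in data, the sketch below keeps a separate custom index and consults both at lookup time. Everything here is assumed: `build_index` and `lookup` are hypothetical, the entries are plain dicts, and the longest-prefix match is a generic approach to hover lookup, not a description of Zhongwen's or Yomichan's internals.

```python
def build_index(entries):
    """Map every headword (traditional and simplified) to its entries."""
    index = {}
    for entry in entries:
        # A set avoids indexing the entry twice when both forms are identical.
        for headword in {entry["traditional"], entry["simplified"]}:
            index.setdefault(headword, []).append(entry)
    return index


def lookup(text, builtin_index, custom_index, max_len=8):
    """Find the longest prefix of `text` present in either index.

    Returns (matched_word, entries) with custom entries listed first,
    or (None, []) when nothing matches.
    """
    for length in range(min(max_len, len(text)), 0, -1):
        word = text[:length]
        hits = custom_index.get(word, []) + builtin_index.get(word, [])
        if hits:
            return word, hits
    return None, []


# Usage: the built-in data only knows "区块"; the user added "区块链".
builtin = build_index([
    {"traditional": "區塊", "simplified": "区块",
     "pinyin": "qu1 kuai4", "definitions": ["block"]},
])
custom = build_index([
    {"traditional": "區塊鏈", "simplified": "区块链",
     "pinyin": "qu1 kuai4 lian4", "definitions": ["blockchain"]},
])
word, hits = lookup("区块链技术", builtin, custom)
```

Because the two indexes stay separate, user entries can be cleared or re-imported without any risk of corrupting the bundled dictionary.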

cschiller commented 4 years ago

I'm sorry, I hope you understand that from a security standpoint adding unsanitized content to the built-in dictionary, be it by reading in a file or reading in form input, is simply unacceptable.

Instead, as mentioned above, I would suggest you submit your entries to CC-CEDICT. This way you can also contribute to the dictionary and other users can benefit from your work.

Thanks!