nikita-moor / latin-dictionary

Latin language dictionaries
https://nikita-moor.github.io/dictionaries/

Dictionary formats #2

Open nikita-moor opened 5 years ago

nikita-moor commented 5 years ago

The main dictionary format of the project is XDXF. GoldenDict is a good choice for desktop users, but mobile users have no good application.

Therefore, I would like to know which dictionary application (Android or iOS) you use or prefer, and which supplementary formats would be useful to produce for mobile devices.

Related discussion on XDXF project.

Multi-format dictionary shells:

StarDict ABBYY Dictd
Alpus ✓¹
GoldenDict Mobile
EBPocket Free ✓¹
Dictan ✓¹
Linguae (desktop)

¹ only source files (DSL)

Dictionary formats

ToDo

nikita-moor commented 5 years ago

Embedding images: StarDict

Support of images in different variants of StarDict:

r h x
Goldendict (desktop)
Goldendict Mobile (Android) ✗*
Alpus (Android, iOS, desktop)
ColorDict (Android)
Twinkle Star Dictionary (Android)
EBPocket Free (Android, iOS, desktop)
Dicty (iOS)
QDict (Android) abandoned
Dict Box (Android, iOS)
Stardict Dictionary for PC (Android)
WordMateX (Android)
DusalDict (Android)

Formats:

Notes:

Test bundle: stardict-test-img.zip

Could import StarDict files:

nikita-moor commented 5 years ago

Embedding images: ABBYY DSL

This format is most popular in Russia(?), so not many applications support it. The official mobile client, ABBYY Lingvo Dictionaries, does not allow adding custom dictionaries. However, it contains a free Latin-Russian dictionary…

application images
Goldendict Mobile
Alpus

Notes:

nikita-moor commented 5 years ago

Embedding images: general

Even when an application can show images referenced in dictionary articles, it does so directly in the application window. As a result, full-page scans are either too small to read or too big to fit the screen; no application provides comfortable zoom and navigation.

This depends on device size, so what works on a tablet may be inconvenient on a smartphone. I would prefer an option to switch between the full image and an icon-size preview, so that the image can be opened in an external image viewer. For illustration (here Alpus did not recognize two pages in TIFF CCITT G4 format):

icon-preview

full-preview

nikita-moor commented 5 years ago

Format: MDict

File format v2.0; images are stored in an MDD file and referenced as <img src="picture.png"/>.
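Since every image referenced from an article has to be present in the MDD file, it can help to list the references before compiling. The following is only a sketch using the Python standard library; `ImgSrcCollector` and `referenced_images` are hypothetical helper names, not part of any MDict tool.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in an article."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img .../>.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def referenced_images(article_html):
    """Return the image paths referenced by one article, in order."""
    parser = ImgSrcCollector()
    parser.feed(article_html)
    parser.close()
    return parser.sources


if __name__ == "__main__":
    print(referenced_images('<p>rosa <img src="picture.png"/></p>'))
```

Running this over all articles gives the set of files that the MDD file is expected to contain.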

application images
GoldenDict (desktop)
MDict (Android, iOS, desktop)
Eudic/欧路词典 (Android, iOS, desktop)
BlueDict (Android)
Plain Dictionary (Android)
EBPocket Free (Android, iOS, desktop)
Medict (desktop) ?
SkyDic (Windows Phone) ?

Formatting: manual compilation, python-writemdict.

Per-app shortcomings:

Comments

The MDict format is very pleasant to work with; the ability to include custom CSS styles and JS libraries is unique and very powerful. The dictionary applications are alive and actively developed. Morphology search is supported in MDict (Hunspell) and BlueDict (separate dictionary; probably applicable to other shells).

MDict is a commercial closed format. Wang Xiaoqiang and @zhansliu analyzed versions 1.2 and 2.0, and most of the libraries for Python, Java, JavaScript, etc. are based on their description of the format. Do third-party dictionary shells support the current version 4.0?

nikita-moor commented 5 years ago

Format: Slob

Slob is another promising format. It supports including images, CSS styles and JavaScript code in one file.

The dictionary shell Aard 2 for Android is open source and imposes no limitations on use (such as no more than 5 dictionaries in a free version); GoldenDict supports the Slob format. There are extensive Python libraries and tools.

Disadvantages

  1. Slob is only a container; the encoding of text content is not standardized and is expected to be plain text or HTML.
  2. Slob is supported only by Aard 2 (mobile) and GoldenDict (desktop). Users who have dictionaries in other formats (StarDict, DSL, MDict) would be obliged to work with two shells simultaneously.
  3. Binary format.
  4. Inter-dictionary links are not supported.

Issues

ToDo

Conclusion

Slob has all the features I like in MDict, and it is an open format, but MDict is more popular and better supported by dictionary shells.

michaelbeijer commented 4 years ago

Do third-party dictionary shells support the current version 4.0?

See my wiki: http://beijer.wiki/mdxbuilder-manual_eng.txt

No, not that I know of. To convert MDict files (in .mdx format) for use in GoldenDict, you need to use MDXBuilder. The latest version of MDXBuilder (available on the MDict website) does not (yet) generate files that GoldenDict can handle. For this to work you need an older version of MDXBuilder. I managed to find one online (from hi-pda.com), and have uploaded it to http://beijer.wiki/storage/MdxBuilder-(downloaded-from-hi-pda.com).zip in case you're looking for a copy.

Michael Beijer (technical translator, beijer.uk/beijer.wiki)

nikita-moor commented 2 years ago

Please use Discussions for further conversation. This page is intended to be a place for documentation.

nikita-moor commented 2 years ago

Format: DICT

DICT is a dictionary network protocol created in 1997 (it can also work locally on the user's computer). Articles can be provided as plain text or HTML (or any other format with an appropriate MIME header). All servers, particularly Dictd and Dico, support the MIME option. Also, some dictionary shells, such as GoldenDict, can read files in DICT format directly.
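To give an idea of what a DICT exchange looks like from the client side, here is a minimal sketch that assumes the command and reply layout of RFC 2229: a DEFINE command, a 151 status line before each definition, and a lone dot ending each body. The function names are hypothetical, and no network connection is made; the reply is parsed from a canned string.

```python
def build_define(database, word):
    """Build a DEFINE command line; the word is sent as a quoted string."""
    escaped = word.replace("\\", "\\\\").replace('"', '\\"')
    return f'DEFINE {database} "{escaped}"\r\n'.encode("utf-8")


def parse_define_response(raw):
    """Extract the definition bodies from the text of a DEFINE reply.

    Each body follows a 151 status line and ends with a line holding a
    single dot; a leading double dot is unescaped to one dot.
    """
    definitions = []
    body = None  # None means we are not inside a definition body
    for line in raw.split("\r\n"):
        if body is None:
            if line.startswith("151 "):
                body = []
            continue
        if line == ".":
            definitions.append("\n".join(body))
            body = None
        else:
            body.append(line[1:] if line.startswith("..") else line)
    return definitions


if __name__ == "__main__":
    # Canned reply for illustration only; not captured from a real server.
    reply = (
        "150 1 definitions retrieved\r\n"
        '151 "rosa" lat "Sample database"\r\n'
        "rosa, ae f.\r\n"
        ".\r\n"
        "250 ok\r\n"
    )
    print(build_define("*", "rosa"))
    print(parse_define_response(reply))
```

A real client would send these bytes over a TCP connection (the registered DICT port is 2628) and could issue `OPTION MIME` first, in which case each body is expected to start with MIME headers followed by a blank line; that case is not handled in this sketch.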

Clients

Plain text HTML
GoldenDict desktop ✓¹
Lingoes ? ?
GNOME Dictionary ✗²
xfce4-dict
GoldenDict Mobile ✗¹
Fora Dictionary / Alpus ✗¹

¹ can read DICT files directly
² not implemented

Example

Conclusion

There were many clients in the past, but now we have only GoldenDict. However, DICT can still be a good way to make an online dictionary.