sanskrit-lexicon / COLOGNE

Development of http://www.sanskrit-lexicon.uni-koeln.de/

Integrate Cologne dictionaries with stardict-updater #98

Closed drdhaval2785 closed 3 years ago

drdhaval2785 commented 7 years ago

It is a bit ambitious, but it would be great if we could convert all dictionaries (in descending order of importance, of course) to Babylon format. We would then be able to use them as StarDict files. I personally use them on mobile in ColorDict and am deeply in love with the ease of it. https://github.com/sanskrit-coders/stardict-sanskrit/

The Babylon format is simple enough.

Line 1 - headwords separated by |
Line 2 - Dictionary entry
Line 3 - line break
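
The three-line layout above can be sketched in Python as a small writer and reader. This is only an illustration, not the code used by either project: the names `to_babylon` and `parse_babylon` are hypothetical, the sample entries are made up, and replacing newlines inside an entry with `<br>` is an assumption made here so that each entry stays on a single line.

```python
from typing import Iterable, List, Tuple

# One dictionary entry: (list of headwords, entry text). Hypothetical type alias.
Entry = Tuple[List[str], str]


def to_babylon(entries: Iterable[Entry]) -> str:
    """Serialize entries as: headwords joined by '|', entry line, blank line."""
    blocks = []
    for headwords, definition in entries:
        if not headwords:
            raise ValueError("each entry needs at least one headword")
        head_line = "|".join(h.strip() for h in headwords)
        # Assumption: embedded newlines become <br> so the entry fits on one line.
        body_line = "<br>".join(definition.splitlines())
        blocks.append(f"{head_line}\n{body_line}\n\n")
    return "".join(blocks)


def parse_babylon(text: str) -> List[Entry]:
    """Read the same layout back into (headwords, entry text) pairs."""
    entries = []
    for block in text.strip().split("\n\n"):
        lines = block.splitlines()
        if len(lines) < 2:
            continue  # skip malformed or empty blocks
        entries.append((lines[0].split("|"), lines[1]))
    return entries


sample = [(["agni", "agnI"], "fire"), (["deva"], "god\ndeity")]
babylon_text = to_babylon(sample)
print(babylon_text)
```

Keeping a blank line after every entry, including the last one, means files can be concatenated without merging neighbouring entries.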

Currently the stardict-dictionary-updater project uses some dictionary data from Cologne (scraped maybe some years back). This means its users are not benefiting from the corrections being made day in and day out on the Cologne servers.

If we integrate seamlessly, both will benefit.

  1. Dictionaries will have wider circulation (also on Android and other mobile devices).
  2. Dictionaries will be kept more up to date.

@gasyoun and @funderburkjim What do you think about this?

Not much work compared to the benefit.

vvasuki commented 7 years ago

They were uploaded on 17-Apr-2017 and till 23-Apr-2017, the download counts are as below

I am honestly surprised as well. There aren't that many active "updater" installs, and I did not expect users to have run the updater app so soon (without any announcement from our end too) - image

It is unlikely that these dicts were installed independently of the updater app (remember - no announcement done). This of course, is totally separate from:

drdhaval2785 commented 7 years ago

The stats show that roughly 20+ downloads were done of 'whole sa-head' type. Default selected dicts were all downloaded. Why on earth would anyone download bopp independently?

drdhaval2785 commented 7 years ago

(without any announcement from our end too) -

I intend to defer announcement till licence issue is settled to the satisfaction of @funderburkjim. See https://github.com/sanskrit-lexicon/cologne-stardict/issues/1.

vvasuki commented 7 years ago

I intend to defer announcement till licence issue is settled to the satisfaction of @funderburkjim. See sanskrit-lexicon/cologne-stardict#1.

No hurry at all. Take your time.

Why on earth would anyone download bopp independently?

Because it is not clear from the filename that they're getting a Latin dictionary! (which is what I alluded to in https://github.com/sanskrit-coders/stardict-sanskrit/issues/37 ) Maybe they think it is their baap's dict :-)

gasyoun commented 7 years ago

baap's dict

You made my day.

gasyoun commented 7 years ago

See the offline dictionary I use on Win 7: https://sourceforge.net/projects/sandic/files/stats/timeline?dates=2012-03-01+to+2017-04-24

5years-10k-downloads

vvasuki commented 7 years ago

That's quite some impact! I observe a mild downward trend. It's interesting to note so many non-Indian users (I suspect most of the US downloads are by Indians, though) - Russia, Korea and Ukraine are a surprise to me in particular.

funderburkjim commented 6 years ago

@drdhaval2785

Peter Scharf asked me: Do Stardict and the downloader work on Mac iPhone?

Do we have any information on that?

vvasuki commented 6 years ago

Stardict works via the paid https://itunes.apple.com/us/app/dictionary-universal/id312088272?mt=8 app ( http://old.aupasana.com/software/stardict has screenshots). Haven't found a free alternative ( https://groups.google.com/forum/#!topic/sanskrit-programmers/b5dfwgOEcls ).

The downloader does not work on iPhone - only on Android.

funderburkjim commented 6 years ago

The actual dictionary files (from Cologne made by Dhaval) are the xx.babylon files here. Right?

So if one could get these files onto an iPhone, then the 'Dictionary Universal' app would display them.

Is there an ad hoc way to get such a file onto an iOS device to test the compatibility of the DU app with the files?

funderburkjim commented 6 years ago

Another possible application is to use the stardict format files along with some stardict viewer as a way to get standalone versions of the dictionaries for Windows, macOS, and Linux. While we have a way to get such versions of the dictionaries on these operating systems, that way depends on first setting up a local PHP server. This step is too complicated for many people who would otherwise like to have one or more of the Cologne dictionaries available locally on their laptop or desktop computer.

vvasuki commented 6 years ago

Another possible application is to use the stardict format files along with some stardict viewer as a way to get standalone versions of the dictionaries for Windows, macOS, and Linux. While we have a way to get such versions of the dictionaries on these operating systems, that way depends on first setting up a local PHP server.

This is a fully solved problem. I refer to a hundred-plus stardict dictionaries (in one shot) on my Linux computer using the GoldenDict app. I hear that Windows and Mac users are similarly endowed. No need for further efforts here.

The actual dictionary files (from Cologne made by Dhaval) are the xx.babylon files here. Right?

Yes.

So if one could get these files onto an iPhone, then the 'Dictionary Universal' app would display them. Is there an ad hoc way to get such a file onto an iOS device to test the compatibility of the DU app with the files?

We know that the stardict files (basically the tar.gz URLs listed here) generated from those initial babylon files are already fully compatible with the 'Dictionary Universal' app - nothing to test anew. And it is quite simple to get any dictionary one desires, one at a time, as explained in http://old.aupasana.com/software/stardict .

gasyoun commented 6 years ago

New OCR of MD dictionary https://raw.githubusercontent.com/novikovag/OCR/master/Macdonell%20A.A.%20Sanskrit-English%20Dictionary/html/data.txt.html - for comparison.

drdhaval2785 commented 3 years ago

Stardicts were generated and are being regenerated on and off. Safe to close. The cron job is missing, but that is tracked separately.