sanskrit-lexicon / csl-json

Implementation of Json format of Cologne dictionary data.

Too much data crashes safari #8

Closed drdhaval2785 closed 2 years ago

drdhaval2785 commented 2 years ago

I used to see blank screens in the ashtadhyayi app on iPhone. It boils down to this:

QUOTE It turns out that the Kosha is fetching a very large amount of data (on the order of 400 MB) and storing it in the browser's local cache. While this is OK with Chrome, it causes problems with Safari (it exceeds Safari's internal cache limit).

The problem started appearing recently because each result now carries the additional "Correction Submission" text, followed by a link and other markup. I did some quick back-of-the-envelope calculations and realized that for Vachaspatyam alone this extra info adds up to 15 MB. I am fetching a total of 15 dictionaries from Cologne. Assuming an average of 10 MB of extra data per dictionary for the "correction submission" link, this still adds up to 150 MB or so. (This does not include the "scan page" link, so the actual number will be even bigger.) This means that without the extra "correction submission" data, we will reduce the storage by about 35%.

Given that this extra data is highly boilerplate, is it possible for you to remove this data from the response, and instead send me info on how to construct this data on my side? That will reduce the size significantly. UNQUOTE
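The estimate in the quote can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the figures (15 dictionaries, an assumed average of 10 MB extra each, about 400 MB in total) come from the quote, while the function name `extra_data_estimate` is made up for illustration.

```python
def extra_data_estimate(num_dicts, avg_extra_mb, total_mb):
    """Return (extra data in MB, extra data as a fraction of the total)."""
    extra_mb = num_dicts * avg_extra_mb
    return extra_mb, extra_mb / total_mb


# Figures taken from the quoted message; the 10 MB average is its own assumption.
extra_mb, fraction = extra_data_estimate(15, 10, 400)
print(f"{extra_mb} MB extra, {fraction:.1%} of the total")
```

With these inputs the share works out to 37.5%, which is in the same ballpark as the 35% mentioned in the quote.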

drdhaval2785 commented 2 years ago

Currently the text data of SKD looks like this.

            "14": "अंशुधरः, पुं, (अंशूनां धरः । धरति इति धरः ॥ <BR>पचाद्यच् ।) सूर्य्यः ॥ इति त्रिकाण्डशेषः ॥ <BR><a href=\"https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php?dict=SKD&page=1-001\" target=\"_blank\">Scan page : 1-001</a> <BR><a href=\"https://github.com/sanskrit-lexicon/csl-ldev/blob/main/v02/skd/14.txt\" target=\"_blank\">Correction submission : aMSuDaraH, 14</a>"

I propose to show the result in the following format.

            "243": [
                "अंशुधरः, पुं, (अंशूनां धरः । धरति इति धरः ॥ <BR>पचाद्यच् ।) सूर्य्यः ॥ इति त्रिकाण्डशेषः ॥",
                "1-001",
                14
            ]

That is, a list holding the definition, the page number (pc) and the dictionary entry number (lnum).
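As a rough illustration of the conversion, an old-style string could be split into these three fields as follows. This is a minimal sketch, not the code used in the repository: the function name `split_entry` and the regular expression are assumptions based only on the single SKD example above, and lnum is kept as a string here.

```python
import re

# Assumed layout of an old-style value (inferred from the one SKD example):
#   <definition> <BR><a ...>Scan page : PC</a> <BR><a ...>Correction submission : KEY, LNUM</a>
OLD_ENTRY = re.compile(
    r'^(?P<text>.*?)\s*'
    r'<BR><a href="[^"]*"[^>]*>Scan page : (?P<pc>[^<]+)</a>\s*'
    r'<BR><a href="[^"]*"[^>]*>Correction submission : (?P<key>[^,<]+), (?P<lnum>[^<]+)</a>\s*$',
    re.DOTALL,
)


def split_entry(old_value):
    """Split an old-style entry string into [text, pc, lnum]."""
    match = OLD_ENTRY.match(old_value)
    if match is None:
        raise ValueError("entry does not follow the expected layout")
    return [match.group("text"), match.group("pc"), match.group("lnum")]
```

A `<BR>` inside the definition itself is left untouched, because only a `<BR>` that is immediately followed by the scan-page link ends the text field.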

From the page number you can easily construct the following item: <a href=\"https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php?dict=SKD&page=1-001\" target=\"_blank\">Scan page : 1-001</a>

And from lnum the following item: <a href=\"https://github.com/sanskrit-lexicon/csl-ldev/blob/main/v02/skd/14.txt\" target=\"_blank\">Correction submission : aMSuDaraH, 14</a>
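On the client side, rebuilding the two links could look roughly like this. It is a sketch under stated assumptions: the URL patterns are generalized from the single SKD example (including the lowercase dictionary code in the GitHub path), the function names are hypothetical, and the headword key (`aMSuDaraH` here) is passed in separately because it is not part of the [definition, pc, lnum] list.

```python
# URL patterns inferred from the SKD example; other dictionaries are assumed to follow them.
SCAN_URL = ("https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/"
            "servepdf.php?dict={dict_code}&page={pc}")
CORRECTION_URL = ("https://github.com/sanskrit-lexicon/csl-ldev/blob/main/v02/"
                  "{dict_lower}/{lnum}.txt")


def scan_link(dict_code, pc):
    """Build the 'Scan page' anchor from a dictionary code and a page number."""
    url = SCAN_URL.format(dict_code=dict_code, pc=pc)
    return f'<a href="{url}" target="_blank">Scan page : {pc}</a>'


def correction_link(dict_code, headword, lnum):
    """Build the 'Correction submission' anchor from the headword key and lnum."""
    url = CORRECTION_URL.format(dict_lower=dict_code.lower(), lnum=lnum)
    return (f'<a href="{url}" target="_blank">'
            f'Correction submission : {headword}, {lnum}</a>')


print(scan_link("SKD", "1-001"))
print(correction_link("SKD", "aMSuDaraH", 14))
```

Since the anchors are generated only when an entry is displayed, the boilerplate no longer has to be stored with every entry in the cache.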

gasyoun commented 2 years ago

> I propose to show the result in the following format.

Makes sense, 150 MB is a lot on a phone.

drdhaval2785 commented 2 years ago

https://github.com/sanskrit-lexicon/csl-json/commit/57548f8a80d997130bc9eeea9ad90dc6783d0730 closes this. Now the data is in [text, pc, lnum] format as shown in https://github.com/sanskrit-lexicon/csl-json/issues/8#issuecomment-1037961308.