hrishikeshrt / PyCDSL

Python Interface to Cologne Digital Sanskrit Lexicon (CDSL)
https://pypi.org/project/PyCDSL/

Allow download of specified dictionaries during setup. #2

Closed by drdhaval2785 2 years ago

drdhaval2785 commented 2 years ago

Description

Allow the user to download only selected dictionaries. Currently, it seems that calling CDSL.setup() forces the user to download all the dictionaries. Some users want only specific dictionaries and would rather not download the others, to keep their system clutter-free.

What I Did

Python 3.6.9 (default, Dec  8 2021, 21:08:43) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pycdsl
>>> CDSL = pycdsl.CDSLCorpus()
>>> CDSL.setup()
100%|███████████████████████████████████████████████████████████| 10.1M/10.1M [00:06<00:00, 1.56MB/s]
100%|███████████████████████████████████████████████████████████| 36.7M/36.7M [00:20<00:00, 1.77MB/s]
100%|███████████████████████████████████████████████████████████| 7.90M/7.90M [00:04<00:00, 1.67MB/s]
100%|███████████████████████████████████████████████████████████| 4.49M/4.49M [00:02<00:00, 1.66MB/s]
True

hrishikeshrt commented 2 years ago

Currently, four dictionaries are downloaded by default: "MW", "MWE", "AP90" and "AE".

Specific dictionaries can be installed by passing a list of dictionary IDs to CDSL.setup(). For example:

import pycdsl
CDSL = pycdsl.CDSLCorpus()
CDSL.setup(["VCP", "SKD"])

Whether there should be any default dictionaries at all can be considered separately. I do agree on the clutter point. My thinking was that the most used dictionaries are Monier-Williams' and Apte's, in their Sanskrit-English and English-Sanskrit forms.
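
To make the default-versus-explicit question concrete, here is a minimal sketch of one way the choice could be resolved. It is not PyCDSL's actual implementation: the helper name resolve_dictionaries and the constant DEFAULT_DICTIONARIES are hypothetical, and only the four default IDs are taken from this thread.

```python
# Hypothetical helper for illustration only; not part of PyCDSL.
# The four IDs below are the defaults mentioned in this thread.
DEFAULT_DICTIONARIES = ("MW", "MWE", "AP90", "AE")


def resolve_dictionaries(requested=None, defaults=DEFAULT_DICTIONARIES):
    """Return the dictionary IDs to download, in order, without duplicates.

    requested=None means "no explicit choice", so the defaults are used.
    An explicit empty list means "download nothing".
    """
    chosen = defaults if requested is None else requested
    seen = set()
    result = []
    for dict_id in chosen:
        # Keep the first occurrence of each ID and preserve the given order.
        if dict_id not in seen:
            seen.add(dict_id)
            result.append(dict_id)
    return result


print(resolve_dictionaries())                       # falls back to the defaults
print(resolve_dictionaries(["VCP", "SKD", "VCP"]))  # explicit choice, deduplicated
print(resolve_dictionaries([]))                     # explicit "nothing"
```

Under this sketch, the resolved list would then be passed to CDSL.setup() in the same way as the ["VCP", "SKD"] example. Distinguishing None from an empty list is the design choice that would let defaults and a clutter-free setup coexist.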

drdhaval2785 commented 2 years ago

If you can note in the docs that setup() accepts a list of dictionaries as an argument, this issue can be closed.

hrishikeshrt commented 2 years ago

Updated the documentation of the CDSLCorpus.setup() function in commit 7c58cf4e5161c83f76e4552fb18767ad09fc1ec4, which fixes this issue. The documentation on readthedocs.io will be updated with this commit. The changes will appear in the package's next release.

drdhaval2785 commented 2 years ago

I would also request updating the docs at https://pycdsl.readthedocs.io/en/latest/readme.html#usage.