OpenPecha / Botok

🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python
https://botok.readthedocs.io/
Apache License 2.0

Check existence of the latest resource files before downloading #69

Closed BLKSerene closed 2 years ago

BLKSerene commented 4 years ago

botok tries to download its resource files from GitHub every time it is imported; the existence check only runs after the download attempt has succeeded or failed.

I think it would be better to check whether the latest version of the resource files is already present before downloading rather than after (a version file might be needed for this). As it stands, users who have already downloaded the latest resource files still have to wait 50 seconds (the current timeout setting) whenever there is a problem connecting to GitHub.

OS: Windows 10 x64
Python: 3.7.7 x64
botok: 0.7.5
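
A minimal sketch of the check-before-download idea, not botok's actual implementation. The `VERSION` marker file and the names `read_local_version`, `ensure_resources`, `fetch_latest_version`, and `download` are all hypothetical. The caller is assumed to supply a version lookup that uses a short timeout, so an unreachable GitHub costs a few seconds rather than the full download timeout.

```python
from pathlib import Path
from typing import Callable, Optional

VERSION_FILE = "VERSION"  # hypothetical marker file, not part of botok


def read_local_version(resource_dir: Path) -> Optional[str]:
    """Return the locally recorded version, or None if nothing is installed."""
    version_path = resource_dir / VERSION_FILE
    if not version_path.is_file():
        return None
    return version_path.read_text(encoding="utf-8").strip() or None


def ensure_resources(
    resource_dir: Path,
    fetch_latest_version: Callable[[], str],
    download: Callable[[Path, str], None],
) -> str:
    """Download resources only when the local copy is missing or outdated.

    Returns the version that is usable after the call.
    """
    local = read_local_version(resource_dir)
    try:
        # Assumed to be a small request with a short timeout.
        latest = fetch_latest_version()
    except OSError:
        # Network trouble: keep using the local copy if there is one.
        if local is not None:
            return local
        raise
    if local == latest:
        return local  # already up to date, so skip the download entirely
    download(resource_dir, latest)
    resource_dir.mkdir(parents=True, exist_ok=True)
    (resource_dir / VERSION_FILE).write_text(latest, encoding="utf-8")
    return latest
```

With this ordering, the full download is attempted only when the local copy is missing or the version lookup reports a different version.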

ngawangtrinley commented 4 years ago

Thanks for the feedback, we'll look into that tomorrow.

BLKSerene commented 4 years ago

I've just found that this issue has been resolved in botok 0.8.1. However, I would prefer the downloaded files to be placed under site-packages/botok/ instead of Users/Username/Documents/, so that PyInstaller can conveniently collect the resources botok needs when a program using botok is frozen into an executable. That said, I know how to change the download path in botok 0.8.1, and it is quite easy to do.
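
For illustration, a sketch of how a directory inside an installed package could be resolved using only the standard library. The helper name `package_resource_dir` and the `resources` subdirectory are hypothetical, and the sketch makes no claim about how botok 0.8.1 actually configures its download path.

```python
import importlib.util
from pathlib import Path


def package_resource_dir(package_name: str, subdir: str = "resources") -> Path:
    """Return <package directory>/<subdir> for an installed package.

    The package is located without being imported.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"cannot locate package {package_name!r}")
    # For a regular package, spec.origin points at its __init__.py.
    return Path(spec.origin).parent / subdir
```

Under these assumptions, `package_resource_dir("botok")` would resolve to a folder under site-packages/botok/. The package directory may not be writable in every installation, so this suits resources fetched before the program is frozen.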