KELLIA / dictionary

The dictionary, comprising the Coptic lexicon created by the BBAW and an interface by Coptic SCRIPTORIUM. Currently deployed at https://coptic-dictionary.org

`phrase_freqs.tab` not found #216

Open yoshiask opened 11 months ago

yoshiask commented 11 months ago

Describe the bug

The `dictionary_reader.py` utility attempts to load a file named `phrase_freqs.tab`, which does not exist in the repo (or anywhere else, as far as I can tell).

To Reproduce

Steps to reproduce the behavior:

  1. Clone repository
  2. Run dictionary_reader.py from any directory
  3. See error:
    o Reading ./xml/Comprehensive_Coptic_Lexicon-v1.2-2020.xml
    Traceback (most recent call last):
      File "/mnt/c/Users/jjask/source/repos/KELLIA/dictionary/./utils/dictionary_reader.py", line 612, in <module>
        data = io.open("phrase_freqs.tab", encoding="utf8").read().strip().split("\n")
    FileNotFoundError: [Errno 2] No such file or directory: 'phrase_freqs.tab'

Expected behavior

The `phrase_freqs.tab` file should be provided in the repo, or if this is not possible, the dependency should be documented and made optional.
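For illustration, here is a minimal sketch of what "made optional" could look like. The helper name `read_optional_lines` is hypothetical and is not part of `dictionary_reader.py`. It assumes the script can carry on with an empty list when the frequency file is absent, which I have not verified against the rest of the utility.

```python
import io
import os
import sys


def read_optional_lines(path, encoding="utf8"):
    """Return the lines of `path`, or an empty list if the file is missing.

    Hypothetical helper: the name and the fallback behavior are assumptions,
    not something taken from the repository.
    """
    if not os.path.isfile(path):
        # Warn instead of raising, so the rest of the run can continue.
        sys.stderr.write("! %s not found, continuing without it\n" % path)
        return []
    with io.open(path, encoding=encoding) as handle:
        text = handle.read().strip()
    # An empty file yields no lines rather than [""].
    return text.split("\n") if text else []


# Possible call site, replacing the bare io.open(...) call from the traceback.
data = read_optional_lines("phrase_freqs.tab")
```

With something along these lines, a missing file would produce a warning on stderr rather than a `FileNotFoundError`, and any code consuming `data` would simply see no frequency entries.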

Additional context

I'm only trying to use the utility because parsing the XML source at the start of my program is too costly, and the checked-in [alpha_kyima_rc1.db](https://github.com/KELLIA/dictionary/blob/master/alpha_kyima_rc1.db) is four years out of date. It might be worth updating the DB if you can't provide the missing file for the utility. I can open a separate issue for this if requested.

yoshiask commented 11 months ago

Just realized it's available in the dev branch. This issue can be closed.

amir-zeldes commented 11 months ago

Thanks for reporting - we should probably put out a new stable release and pull dev to master. I'll leave this open as a reminder and hope we can find some time to do that before long.