digitalpalidictionary / dpd-db

Digital Pāḷi Database

Using the DB

  1. Clone this repo
  2. Download dpd.db.tar.bz2 from this page
  3. Extract the archive and place its contents in the root of the project folder
  4. Install poetry
  5. In the terminal, run poetry install
  6. See scripts/db_search_example.py for a quick tutorial on how to use the database with SQLAlchemy

Code Structure

There are four main parts to the code:

  1. Create the database and build up the tables of derived data.
  2. Add new words, edit and update the db with a GUI.
  3. Run data integrity tests on the db.
  4. Compile all the parts and export into various dictionary formats.

About the database

Building the DB

  1. Download this repo
  2. Fetch tipitaka-xml by running git submodule init && git submodule update
  3. Install nodejs
  4. Install poetry
  5. poetry install
  6. poetry run bash scripts/bash/initial_setup_run_once.sh
  7. poetry run bash scripts/bash/build_db.sh
  8. To be able to run database tests you may need to install some of these packages.

That should create an SQLite database, ./dpd.db, which can be opened with DB Browser, DBeaver, SQLAlchemy or any other SQLite client you prefer.

For a quick tutorial on how to access any information in the db with SQLAlchemy, see scripts/db_search_example.py.
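
As a quick sanity check that the file is a readable SQLite database, the table names can be listed with Python's built-in sqlite3 module. This is a minimal sketch, not the project's own access layer: the function name list_tables is hypothetical, and it makes no assumption about the schema because it only reads SQLite's sqlite_master catalogue.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in an SQLite file, sorted alphabetically.

    Hypothetical helper for illustration; not part of the dpd-db code base.
    """
    # Open read-only through a URI so that a mistyped path raises an error
    # instead of silently creating an empty database file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Example call, assuming the build produced ./dpd.db:
# print(list_tables("dpd.db"))
```

Opening the database in read-only mode keeps this check from modifying the freshly built file.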

Build a complete database locally and extract all dictionaries

⚠️ WARNING: When db/deconstructor/sandhi_splitter.py runs with the config option deconstructor.all_texts = yes, it will take several hours to complete.
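
Before starting a long build it can help to check how this option is set. The sketch below assumes the option lives in an INI-style file with a [deconstructor] section and an all_texts key; that layout, the sample text and the function name all_texts_enabled are assumptions made for illustration, so check the repo for the actual config file and its format.

```python
import configparser

# Assumed INI layout for illustration only; the real config file may differ.
SAMPLE_CONFIG = """
[deconstructor]
all_texts = no
"""


def all_texts_enabled(ini_text: str) -> bool:
    """Return True if deconstructor.all_texts is switched on in the given INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # getboolean accepts yes/no, true/false, on/off and 1/0 (case-insensitive).
    # A missing section or key is treated as "off" through the fallback.
    return parser.getboolean("deconstructor", "all_texts", fallback=False)


if all_texts_enabled(SAMPLE_CONFIG):
    print("all_texts is on: expect the sandhi splitter to run for several hours")
else:
    print("all_texts is off")
```

To check a real file instead of the sample string, read its text first and pass it to the function.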

Starting with a fresh clone of the tip:

git clone --depth=1 https://github.com/digitalpalidictionary/dpd-db.git
cd dpd-db
git submodule init && git submodule update
poetry install
poetry run bash scripts/bash/build_and_make_all.sh

This creates the dpd.db SQLite database and also exports all dictionaries; see the exporter/share folder.

Additional configuration

  1. Install dictzip (link for Linux)
  2. Install tkinter (link for Linux)