signebedi / gita-api

a RESTful Bhagavad Gita API
GNU Affero General Public License v3.0

Add support for text-search REGEX overrides #96

Open signebedi opened 9 months ago

signebedi commented 9 months ago

A major constraint for this application is its tight coupling to the three methods of searching text by reference (1, 1.1, and 1.1-4). This coupling also assumes that all texts will conform to a chapter.verse reference system, which is admittedly a flawed assumption out of the gate. The question then becomes: do we simply exclude non-conforming texts (boo!), or do we force them to conform to the existing structure? The latter is what I did with the corpus Caesarianum, which follows a TEXT_NAME:str BOOK.SECTION.SUB-SECTION format. We navigated that by making each text its own corpus, though that is technically not always desirable, esp. if we want to search all texts within a broader corpus.

Originally posted by @signebedi in https://github.com/signebedi/gita-api/issues/35#issuecomment-1925830506
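For reference, the three supported reference forms can be captured by a single pattern. This is only a minimal sketch: `REFERENCE_RE`, `Reference`, and `parse_reference` are hypothetical names for illustration and are not taken from the gita-api codebase.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern covering the three reference forms named above:
# "1" (chapter), "1.1" (chapter.verse), and "1.1-4" (verse range).
REFERENCE_RE = re.compile(
    r"^(?P<chapter>\d+)(?:\.(?P<verse>\d+)(?:-(?P<end>\d+))?)?$"
)


class Reference(NamedTuple):
    chapter: int
    verse: Optional[int]
    end_verse: Optional[int]


def parse_reference(ref: str) -> Reference:
    """Parse a chapter, chapter.verse, or chapter.verse-verse reference."""
    match = REFERENCE_RE.match(ref.strip())
    if match is None:
        # Anything outside the chapter.verse scheme is rejected outright,
        # which is the constraint this issue is about.
        raise ValueError(f"Unsupported reference format: {ref!r}")
    verse = int(match["verse"]) if match["verse"] else None
    end = int(match["end"]) if match["end"] else None
    return Reference(int(match["chapter"]), verse, end)
```

With this sketch, `parse_reference("1.1-4")` yields chapter 1, verse 1, end verse 4, while a Caesar-style reference such as `"Bellum Gallicum 1.2.3"` raises `ValueError`. That rejection is exactly the case an override would need to handle.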

signebedi commented 9 months ago

My sense is that we can probably add the overrides as an option in corpora.json. I am not sure yet whether we want to define pre-set corpus design patterns, or whether we want to simply let users define their own. Probably both, though it might make sense to implement the latter option first, and then to grandfather well-written overrides into presets within the gita/ library and add a relevant test suite. See below for an example application structure.
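A user-defined override in corpora.json might look roughly like the sketch below. The `reference_regex` key, the corpus entries, and both function names are assumptions made for illustration; the actual schema of corpora.json may differ.

```python
import json
import re

# Hypothetical corpora.json fragment. The "reference_regex" key is an
# assumption; a corpus without it falls back to the default pattern.
CORPORA_JSON = r"""
{
  "gita": {"name": "Bhagavad Gita"},
  "caesar": {
    "name": "Corpus Caesarianum",
    "reference_regex": "^(?P<text>.+?)\\s+(?P<book>\\d+)\\.(?P<section>\\d+)\\.(?P<subsection>\\d+)$"
  }
}
"""

# Default chapter.verse scheme: "1", "1.1", or "1.1-4".
DEFAULT_REFERENCE_REGEX = r"^(?P<chapter>\d+)(?:\.(?P<verse>\d+)(?:-(?P<end>\d+))?)?$"


def compile_reference_patterns(raw_json: str) -> dict:
    """Compile one reference pattern per corpus, honoring any override."""
    patterns = {}
    for key, config in json.loads(raw_json).items():
        source = config.get("reference_regex", DEFAULT_REFERENCE_REGEX)
        try:
            patterns[key] = re.compile(source)
        except re.error as exc:
            # Fail early on a malformed user-supplied pattern.
            raise ValueError(f"Invalid reference_regex for {key!r}: {exc}") from exc
    return patterns


def match_reference(patterns: dict, corpus: str, ref: str) -> dict:
    """Return the named groups captured from a reference string."""
    match = patterns[corpus].match(ref.strip())
    if match is None:
        raise ValueError(f"{ref!r} does not match the pattern for {corpus!r}")
    return match.groupdict()
```

Named groups keep the override self-describing: the caller gets back a dict keyed by whatever parts the corpus defines (`book`, `section`, `subsection` here), so the lookup code does not have to assume a chapter.verse shape.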

.
├── gita
│   ├── __init__.py
│   └── presets.py # <<<< Add this here
├── tests
│   ├── __init__.py
│   ├── __main__.py
│   ├── test_flask_app.py
│   ├── test_gita_api.py
│   └── test_gita_presets.py # <<<< Add this here
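
As a rough sketch of what the two new files could contain, the block below combines a possible gita/presets.py with a matching tests/test_gita_presets.py. The preset names, the `get_preset` helper, and the use of `unittest` are all assumptions for illustration; nothing here reflects existing code in the repository.

```python
import re
import unittest

# --- hypothetical contents of gita/presets.py ---

# Named reference patterns that a corpus could opt into by name.
PRESETS = {
    # "1", "1.1", or "1.1-4"
    "chapter_verse": r"^(?P<chapter>\d+)(?:\.(?P<verse>\d+)(?:-(?P<end>\d+))?)?$",
    # "TEXT_NAME BOOK.SECTION.SUB-SECTION"
    "book_section_subsection": (
        r"^(?P<text>.+?)\s+(?P<book>\d+)\.(?P<section>\d+)\.(?P<subsection>\d+)$"
    ),
}


def get_preset(name: str) -> re.Pattern:
    """Return the compiled pattern for a named preset."""
    try:
        return re.compile(PRESETS[name])
    except KeyError:
        raise ValueError(
            f"Unknown preset {name!r}; available: {sorted(PRESETS)}"
        ) from None


# --- hypothetical contents of tests/test_gita_presets.py ---

class TestPresets(unittest.TestCase):
    def test_every_preset_compiles(self):
        for name in PRESETS:
            self.assertIsInstance(get_preset(name), re.Pattern)

    def test_chapter_verse_range(self):
        groups = get_preset("chapter_verse").match("1.1-4").groupdict()
        self.assertEqual(groups, {"chapter": "1", "verse": "1", "end": "4"})

    def test_unknown_preset_is_rejected(self):
        with self.assertRaises(ValueError):
            get_preset("does_not_exist")
```

Keeping presets as plain pattern strings means a well-written user override can be promoted to a preset by copying it into `PRESETS` and adding a test case for it.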