Bookworm-project / BookwormDB

Tools for text tokenization and encoding
MIT License
84 stars 12 forks

Use argparse in OneClick.py #61

Closed bmschmidt closed 8 years ago

bmschmidt commented 9 years ago

Currently, OneClick.py just loads in arguments in order.

That's fine for just building a bookworm in segments (python OneClick.py database_metadata), but as the arguments get more complicated, the syntax becomes opaque. For example, to supplement metadata with a new field keyed to an existing one, the syntax is python OneClick.py supplementMetadataFromJSON newmetadata.json filename, where filename is the key we're joining on. This is hard to remember and undocumented. Switching to argparse as part of documenting it would make the OneClick.py codebase a bit longer but more flexible. It would also let us pass in useful flags, such as changing the logging level as described in Issue 60, or potentially even changing the working directory or tokenization regex.
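As a rough sketch of what that could look like (not the actual implementation), the positional supplementMetadataFromJSON call might become named options plus a logging flag. The option names --file, --key, and --log-level are assumptions made up for illustration; only argparse and logging themselves are real.

```python
import argparse
import logging


def parse_args(argv=None):
    """Parse a hypothetical OneClick.py command line.

    The flag names here are illustrative assumptions, not the project's API.
    """
    parser = argparse.ArgumentParser(
        description="Hypothetical argparse front end for OneClick.py")
    # The method to run, which is currently the first positional argument.
    parser.add_argument(
        "action", help="method to run, e.g. supplementMetadataFromJSON")
    # Named replacements for the remaining positional arguments.
    parser.add_argument("--file", help="path to the new metadata file")
    parser.add_argument("--key", help="existing field to join on")
    # A global flag of the kind mentioned above (logging level).
    parser.add_argument(
        "--log-level", default="WARNING",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"])
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    logging.basicConfig(level=getattr(logging, args.log_level))
    logging.info("Running %s with file=%s key=%s",
                 args.action, args.file, args.key)
```

With something like this, python OneClick.py supplementMetadataFromJSON --file newmetadata.json --key filename documents itself, and python OneClick.py --help lists every option.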

Those last two are the most compelling reasons to do it for me right now. But ultimately, this might allow us to install the core OneClick.py scripts as a Python executable at /usr/lib/bookworm, so that we don't have to reclone the main repository for every different bookworm built on a server.

bmschmidt commented 9 years ago

Setting up a new branch to track progress on this. I'm going to rename OneClick.py to an executable called "bookworm.py" and use a git-like action-arguments syntax, so that you can run commands like this:

Create the bookworm in the current directory; a wrapper around make.

./bookworm.py build all

Reload memory tables on a particular bookworm:

./bookworm.py --database=federalist reloadMemory

Load in the supplemental metadata at ~/newdata.json to expand the fields available.

./bookworm.py supplement_data --format=json --key=filename --file=~/newdata.json
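One way the three commands above could be wired up is with argparse subparsers. This is only a sketch under assumptions: the function name build_parser and the choice of which options are required are hypothetical, and the real bookworm.py may be organized differently.

```python
import argparse
import os


def build_parser():
    """Build a git-like parser: global options first, then an action."""
    parser = argparse.ArgumentParser(prog="bookworm.py")
    # Global option; it must come before the action on the command line.
    parser.add_argument("--database", default=None,
                        help="name of the bookworm database to act on")

    actions = parser.add_subparsers(dest="action")
    actions.required = True  # fail with a usage message if no action is given

    build = actions.add_parser("build", help="wrapper around make")
    build.add_argument("target", help="make target, e.g. all")

    actions.add_parser("reloadMemory", help="reload memory tables")

    supplement = actions.add_parser(
        "supplement_data", help="load supplemental metadata")
    supplement.add_argument("--format", required=True)
    supplement.add_argument("--key", required=True,
                            help="existing field to join on")
    # Shells generally do not expand "~" inside --file=~/..., so do it here.
    supplement.add_argument("--file", required=True,
                            type=os.path.expanduser)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Each action gets its own --help output (for example ./bookworm.py supplement_data --help), which addresses the documentation problem raised at the top of the issue.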

bmschmidt commented 8 years ago

Closing with v0.4.0