Curation tools for Grambank data.
pygrambank can be installed from PyPI via
pip install pygrambank
or from a clone of grambank/pygrambank:
git clone ...
cd pygrambank
pip install -e .
You should install pygrambank in a virtual environment to make sure it does not interfere with a system-wide Python installation.
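As a minimal sketch of the virtual-environment advice above, the standard library's `venv` module can create an isolated environment from Python itself. The helper name `create_env` and the directory passed to it are illustrative assumptions, not part of pygrambank.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its path.

    `create_env` is a hypothetical helper for illustration only.
    """
    env_path = Path(env_dir)
    # with_pip=True bootstraps pip into the new environment, so that
    # `pip install pygrambank` can be run from inside it afterwards.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

After calling, for example, `create_env(".venv")`, activate the environment (on POSIX shells, `source .venv/bin/activate`) and then run one of the `pip install` commands shown above.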
Installing pygrambank will also install a command line program grambank. Data curation functionality is implemented as subcommands of this program. To get information about available subcommands, run
grambank --help
More information on individual subcommands can be obtained by running
grambank <SUBCOMMAND> -h
e.g.
$ grambank describe -h
usage: grambank describe [-h] [--columns] SHEET

Describe a (set of) sheets.
This includes checking for correctness - i.e. the functionality of `grambank check`.
While references will be parsed, the corresponding sources will **not** be looked up
in Glottolog (since this is slow). Thus, for a final check of a sheet, you must run
`grambank sourcelookup`.

positional arguments:
  SHEET       Path of a specific TSV file to check or substring of a filename
              (e.g. a glottocode)

optional arguments:
  -h, --help  show this help message and exit
  --columns   List columns of the sheet (default: False)
For `describe` and `sourcelookup` at ELDP-glottobank, you must run the commands from the ELDP-glottobank directory; otherwise the file paths to gb20.txt, gb.bib, contributors etc. will not resolve.
e.g.
[2024-05-20 10:45:36] skirgard@lingn06 /Users/skirgard/Git/glottobank/ELDP-glottobank
> grambank describe grambank/original_sheets/FCE_apal1257.tsv
[2024-05-20 10:45:36] skirgard@lingn06 /Users/skirgard/Git/glottobank/ELDP-glottobank
> grambank sourcelookup ../../glottolog/glottolog grambank/original_sheets/FCE_apal1257.tsv
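When these commands are driven from a script, the working-directory requirement can be made explicit rather than relying on where the script happens to be started. The following is a sketch under assumptions: `run_in_dir` and `describe_command` are hypothetical helpers, and only the `grambank describe SHEET` invocation shown above is taken from this document.

```python
import subprocess
import sys
from pathlib import Path


def run_in_dir(command, workdir):
    """Run command with workdir as its working directory; return stdout."""
    workdir = Path(workdir)
    if not workdir.is_dir():
        raise NotADirectoryError(f"{workdir} is not a directory")
    # cwd= makes relative paths (sheets, gb.bib, ...) resolve against workdir.
    # check=True raises CalledProcessError if the command exits non-zero.
    result = subprocess.run(
        command, cwd=workdir, capture_output=True, text=True, check=True
    )
    return result.stdout


def describe_command(sheet):
    """Build the argument list for `grambank describe SHEET`."""
    return ["grambank", "describe", str(sheet)]
```

For instance, `run_in_dir(describe_command("grambank/original_sheets/FCE_apal1257.tsv"), "/Users/skirgard/Git/glottobank/ELDP-glottobank")` would mirror the first session above, provided the `grambank` program is on the `PATH`.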
pygrambank also allows programmatic access to Grambank data from Python programs. All functionality is mediated through a pygrambank.Grambank instance:
>>> from pygrambank import Grambank
>>> gb = Grambank('.')
>>> gb.sheets_dir
PosixPath('original_sheets')
>>> for sheet in gb.iter_sheets():
...     print(sheet)
...     break
...
original_sheets/AH_alag1248.tsv
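Building on the session above, sheets can be grouped by the prefix in their filenames. This is only a sketch: it assumes that sheet filenames follow the pattern `<coder>_<glottocode>.tsv` seen in the examples in this document, and that `str(sheet)` yields the sheet's path, as the printed output above suggests. `parse_sheet_name` and `sheets_per_coder` are hypothetical helpers, not part of pygrambank.

```python
from collections import Counter
from pathlib import Path


def parse_sheet_name(sheet):
    """Split a path like 'original_sheets/AH_alag1248.tsv' into (coder, glottocode).

    Assumes the filename pattern <coder>_<glottocode>.tsv.
    """
    stem = Path(str(sheet)).stem
    # Split on the last underscore; the part after it is taken as the glottocode.
    coder, sep, glottocode = stem.rpartition("_")
    if not sep or not coder or not glottocode:
        raise ValueError(f"unexpected sheet name: {sheet}")
    return coder, glottocode


def sheets_per_coder(sheets):
    """Count sheets per coder prefix."""
    return Counter(parse_sheet_name(sheet)[0] for sheet in sheets)


# Possible usage with a Grambank instance (requires pygrambank and a data repository):
#     gb = Grambank('.')
#     print(sheets_per_coder(gb.iter_sheets()))
```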