mhowison / codebooks


Codebooks

Automatically generate codebooks from dataframes.

Usage:

codebooks -o output.html input.csv

Adding variable descriptions

You can specify a CSV file that maps variable names to descriptions using:

codebooks --desc descriptions.csv -o output.html input.csv

The CSV file is expected to have two columns (variable, description).
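
As a rough illustration of the two-column layout described above, the following sketch writes a descriptions file with Python's standard csv module. The variable names, the description text, and the helper write_descriptions are hypothetical. Whether codebooks expects a header row is also an assumption, so the header is optional here.

```python
import csv

# Hypothetical variable names and descriptions, for illustration only.
DESCRIPTIONS = {
    "age": "Age of respondent in years",
    "income": "Annual household income, in US dollars",
}


def write_descriptions(path, descriptions, header=("variable", "description")):
    """Write a two-column CSV mapping variable names to descriptions.

    Whether codebooks expects a header row is an assumption;
    pass header=None to omit it.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if header is not None:
            writer.writerow(header)
        for name, text in descriptions.items():
            # csv.writer quotes fields that contain commas automatically.
            writer.writerow([name, text])


write_descriptions("descriptions.csv", DESCRIPTIONS)
```

The resulting descriptions.csv can then be passed to the --desc option shown above.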

License

3-Clause BSD (see LICENSE)

Tests

The test/ subdirectory contains a script to generate a synthetic data set, an integration test for the codebooks package, and a benchmark script used to test performance optimizations. You can run these with:

cd test
python dataset.py
codebooks --desc desc.csv dataset.csv
codebooks --desc desc.csv --parquet dataset.parquet
python benchmark.py

Authors

Mark Howison
http://mark.howison.org