A package for maintaining robust, reproducible data management.
Science relies on repeatable results. `rushd` is a Python package that helps with this, both by saving the execution context (e.g. the state of all installed pip packages) and by providing helper functions that help you cleanly and repeatably separate data from code.
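To make "saving the execution context" concrete, here is a minimal sketch that records the Python version and the versions of all installed packages using only the standard library. It illustrates the general idea and is not `rushd`'s own implementation; the function name `snapshot_environment` is hypothetical.

```python
import json
import sys
from importlib import metadata


def snapshot_environment():
    """Return the Python version and the versions of installed distributions.

    Hypothetical helper for illustration; not part of rushd's API.
    """
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "packages": dict(sorted(packages.items(), key=lambda kv: kv[0].lower())),
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be stored next to generated outputs.
    print(json.dumps(snapshot_environment(), indent=2))
```

Storing a snapshot like this next to each output file makes it possible to check later which package versions produced a given result.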
This package is on PyPI, so you can install it with pip:
$ pip install rushd
Alternatively, you can get built wheels from the Releases tab on GitHub.
Simply import `rushd`!
import rushd as rd
See the documentation available at https://gallowaylabmit.github.io/rushd
If you'd like to hack locally on `rushd`, after cloning this repository:
$ git clone https://github.com/GallowayLabMIT/rushd.git
$ cd rushd
you can create a local virtual environment and install `rushd` in "development (editable) mode" with the extra requirements for tests:
$ python -m venv env
$ .\env\Scripts\activate (on Windows)
$ source env/bin/activate (on Mac/Linux)
$ pip install -e .[dev] (on most shells)
$ pip install -e '.[dev]' (on zsh)
After this 'local install', you can use and import `rushd` freely without having to re-install after each update.
We use pre-commit to automatically run linters, formatters, and other checks that keep the code quality high.
After doing the developer install and activating the virtual environment, you should run:
$ pre-commit install
to install the git hooks. Now, pre-commit will automatically run whenever you go to commit.
We use pytest to test our code. Run:
$ pytest
to run all tests, though you can add an optional argument to run some subset of the tests:
$ pytest tests/test_file_io.py
Pytest automatically discovers tests placed in the `tests` directory, in files and functions whose names start with `test`.
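As an illustration of the naming convention, a new test file might look like the sketch below. The file name `tests/test_example.py` and the helper `parse_well` are hypothetical and not part of `rushd`; the point is only that the file name and the test function names begin with `test`.

```python
# tests/test_example.py (hypothetical file; the "test_" prefix is what pytest looks for)


def parse_well(well):
    """Hypothetical helper: split a plate well ID such as 'B07' into ('B', 7)."""
    return well[0].upper(), int(well[1:])


def test_parse_well_splits_row_and_column():
    # Collected by pytest because the function name starts with "test".
    assert parse_well("B07") == ("B", 7)


def test_parse_well_uppercases_row():
    assert parse_well("c12") == ("C", 12)
```

In a real test file the helper would be imported from the package under test rather than defined inline.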
On every push, all of the tests are run, and the coverage (which lines are executed, or "covered", during the tests) is calculated and uploaded to Codecov. This is a nice way of seeing whether you missed any edge cases that need tests.
See the CHANGELOG for detailed changes.
## [0.5.0] - 2024-04-15
### Added
- Added new `rd.plot.debug_axes` which draws guide lines to help with axis alignment.
- Added new `rd.plot.adjust_subplot_margins_inches` which allows configuring subplot margins
  using inch offsets (instead of subfigure coordinate offsets).
### Modified
- `rd.flow.load_csv_with_metadata` and
`rd.flow.load_groups_with_metadata` can now load a subset of columns.
- The `datadir.txt` file can include paths that use `~` to represent the home directory.
- `rd.plot.generate_xticklabels` does not include metadata key labels in plots without yticklabels.
- `rd.plot.generate_xticklabels` no longer throws an error when xticklabels don't match the dictionary passed (instead leaves labels as-is).
- `rd.plot.generate_xticklabels` now enables user-specified line spacing.
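
The `~` support mentioned above corresponds to ordinary home-directory expansion. The sketch below shows how such expansion works with the standard library; it is not `rushd`'s own code, and the function name `resolve_data_path` and the example path are assumptions made for illustration.

```python
from pathlib import Path


def resolve_data_path(raw):
    """Strip surrounding whitespace and expand a leading '~' to the home directory.

    Hypothetical helper for illustration; not part of rushd's API.
    """
    return Path(raw.strip()).expanduser()


if __name__ == "__main__":
    # A line such as "~/lab_data" (hypothetical) becomes an absolute path.
    print(resolve_data_path("~/lab_data\n"))
```

Paths without a leading `~` pass through unchanged, so the same code handles both styles of entry.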
This package is licensed under the MIT license. Use freely!
The name is a reference to Ibn Rushd, a Muslim scholar born in Córdoba who was responsible for translating and adding scholastic commentary to ancient Greek works, especially Aristotle. His translations spurred further translations into Latin and Hebrew, reigniting interest in ancient Greek works for the first time since the fall of the Roman Empire.
His name is pronounced rush-id.
If we take the first and last letter, we also get `rd`: repeatable data!