BICCN / cell-locator

manually align specimens to annotated 3D spaces
https://cell-locator.readthedocs.io

Add Conversion script #181

Closed allemangD closed 3 years ago

allemangD commented 3 years ago

This adds a conversion tool, convert.py, and some Bash test scripts that serve as example usage: idem.sh, update.sh, and compare.sh.

I've tested this against the files listed in https://github.com/BICCN/cell-locator/pull/160#issuecomment-702512235, https://github.com/BICCN/cell-locator/pull/160#issuecomment-858728096, and https://github.com/BICCN/cell-locator/issues/158#issue-709339736

idem.sh converts each file to its own version. Ideally this would leave the file unchanged; however, some differences are inevitably introduced because there is some information loss. Here are the changes that occur in each version.

All converters add .version and sometimes cause negligible floating-point errors.

You can see all the exact changes on the mentioned files here: https://github.com/allemangD/cell-locator/commit/7bddbed0386005d5318ea6c92be4a239addae31c
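For checking by hand that an idem.sh-style round trip changed nothing of substance, a tolerance-aware comparison is handy. The following is a minimal sketch, not part of this PR: `nearly_equal` is a hypothetical helper, the tolerances are arbitrary, and the assumption that the added field is a top-level key named `version` is inferred from the `.version` path above.

```python
import math


def nearly_equal(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Recursively compare two parsed-JSON values, tolerating tiny float drift.

    Hypothetical helper for illustration; not taken from convert.py.
    """
    # bool is a subclass of int, so handle it before the numeric branch.
    if isinstance(a, bool) or isinstance(b, bool):
        return a is b
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            nearly_equal(a[k], b[k], rel_tol, abs_tol) for k in a
        )
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(
            nearly_equal(x, y, rel_tol, abs_tol) for x, y in zip(a, b)
        )
    # Strings and None fall through to plain equality.
    return a == b


def without_version(doc):
    """Drop the top-level 'version' key (assumed name) before comparing."""
    return {k: v for k, v in doc.items() if k != "version"}
```

Typical use would be to `json.load` the original and converted files, pass the converted one through `without_version`, and compare the two with `nearly_equal`.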


All the scripts and classes are documented with docstrings and argparse help text. I've also added some technical documentation and autodocs to the readthedocs; I'd like to publish this branch on readthedocs as a preview but I'm struggling to get the interface to recognize that this branch exists.

The technical documentation and autodocs can be found at: https://cell-locator.readthedocs.io/en/conversion-script/developer_guide/AnnotationFileConverter.html

Here is how the conversion tool help text looks:

$ python convert.py convert -h
usage: convert convert [-h] -v VERSION [-t TARGET] [--no-indent] src dst

positional arguments:
  src                   Source JSON file. Use '-' to read from stdin.
  dst                   Destination JSON file. Use '-' to write to stdout.

optional arguments:
  -h, --help            show this help message and exit
  -v VERSION, --version VERSION
                        Source file version. Use '-v?' to infer the version.
  -t TARGET, --target TARGET
                        Target file version. Defaults to the latest version.
  --no-indent           Do not indent output JSON.
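For readers unfamiliar with argparse subcommands, here is one way an interface of this shape could be declared. This is a sketch reconstructed from the help text above, not the code in convert.py; `build_parser` is a hypothetical name and the real script may be organized differently.

```python
import argparse


def build_parser():
    """Build a parser with a 'convert' subcommand mirroring the help text."""
    parser = argparse.ArgumentParser(prog="convert")
    commands = parser.add_subparsers(dest="command", required=True)

    convert = commands.add_parser("convert")
    # A required option; '-v?' parses as option -v with the attached value '?'.
    convert.add_argument(
        "-v", "--version", required=True,
        help="Source file version. Use '-v?' to infer the version.",
    )
    convert.add_argument(
        "-t", "--target", default=None,
        help="Target file version. Defaults to the latest version.",
    )
    convert.add_argument(
        "--no-indent", action="store_true",
        help="Do not indent output JSON.",
    )
    # argparse treats a lone '-' as a positional value, so '-' can stand
    # for stdin or stdout here.
    convert.add_argument("src", help="Source JSON file. Use '-' to read from stdin.")
    convert.add_argument("dst", help="Destination JSON file. Use '-' to write to stdout.")
    return parser
```

With this sketch, `build_parser().parse_args(["convert", "-v?", "-", "out.json"])` reads from stdin, infers the source version, and targets the latest version by default.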

I am not sure of the best way to organize the converters. I want to satisfy these criteria

The only reasonable way I've found to do this is to name the files using the semantic version string for the corresponding converter. However, that means the file names are no longer valid Python module names: . and + are forbidden characters, and the name must not start with a digit; semantic versions with build metadata contain all of these.
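Files named this way cannot be pulled in with a plain import statement, but they can still be loaded by path through importlib. The following is a minimal sketch of that technique, not necessarily how convert.py does it; `load_converter` and the example file name are hypothetical.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_converter(path):
    """Load a .py file as a module even if its file name is not an identifier."""
    path = Path(path)
    # The module name given to the spec is arbitrary; it is sanitized here
    # only so that tracebacks and repr() stay readable.
    name = "converter_" + "".join(c if c.isalnum() else "_" for c in path.stem)
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


if __name__ == "__main__":
    # Demonstrate with a throwaway file whose name is a (hypothetical)
    # semantic version with build metadata.
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "0.1.0+2020.09.18.py"
        source.write_text("VERSION = '0.1.0+2020.09.18'\n")
        module = load_converter(source)
        print(module.__name__, module.VERSION)
```

The trade-off is that such modules are invisible to the normal import system and to most static tooling, so whatever loads them has to discover the files explicitly, for example by listing the converter directory.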

After some discussion with @jcfr, we decided that this approach makes the most sense. Although it is a gross violation of general Python style advice, I think the readability improvement is worth it.

Supersedes https://github.com/BICCN/cell-locator/pull/160

jcfr commented 3 years ago

Outstanding :100:

Nitpick: