elephant-track / elephant-server

A server implementation of ELEPHANT
BSD 2-Clause "Simplified" License

Question about `dataset/imgs.zarr`. #8

Open JoOkuma opened 3 years ago

JoOkuma commented 3 years ago

Hi, is there any restriction/standard for the `dataset/imgs.zarr` format (dtype, channel order, chunk size, etc.)?

I'm currently converting my data from zarr to the BigDataViewer format, and I would like to skip the step of converting back to zarr for the machine learning and use my original data instead.

Does it work with any zarr array with axes T, Z, Y, X and integer or floating-point values?

ksugar commented 3 years ago

Yes, ELEPHANT expects imgs.zarr to have a specific channel order and shape (T, Z, Y, X), while its dtype and chunk size can be flexible. https://github.com/elephant-track/elephant-server/blob/v0.2.0/elephant-core/elephant/tool/dataset.py

By default, ELEPHANT creates imgs.zarr from the .h5 file with a dtype of uint8 or uint16, depending on the original data, and a shape of (T, Z, Y, X). If you prepare imgs.zarr manually, please also prepare the other .zarr files (see below) in the format specified in the table. The existence of these files, along with their dtypes and shapes, is checked before each command.

dataset
    ├── flow_hashes.zarr
    ├── flow_labels.zarr
    ├── flow_outputs.zarr
    ├── imgs.zarr
    ├── seg_labels_vis.zarr
    ├── seg_labels.zarr
    └── seg_outputs.zarr

| file | dtype | shape |
| --- | --- | --- |
| flow_hashes.zarr | S16 | (T - 1,) |
| flow_labels.zarr | f4 | (T - 1, 4, Z, Y, X) |
| flow_outputs.zarr | f2 | (T - 1, 3, Z, Y, X) |
| seg_labels_vis.zarr | u1 | (T, Z, Y, X, 3) |
| seg_labels.zarr | u1 | (T, Z, Y, X) |
| seg_outputs.zarr | f2 | (T, Z, Y, X, 3) |

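
For anyone preparing these files by hand, the layout in the table above can be derived from the shape of imgs.zarr. The following is only a minimal sketch using NumPy: the helper names `expected_layout` and `find_mismatches` are hypothetical and are not part of ELEPHANT, and the real check performed by the server may differ. It should work with any array-like objects that expose `dtype` and `shape`.

```python
import numpy as np


def expected_layout(imgs_shape):
    """Return {file name: (dtype, shape)} for the companion arrays,
    following the table above. `imgs_shape` is (T, Z, Y, X)."""
    t, z, y, x = imgs_shape
    return {
        "flow_hashes.zarr": (np.dtype("S16"), (t - 1,)),
        "flow_labels.zarr": (np.dtype("f4"), (t - 1, 4, z, y, x)),
        "flow_outputs.zarr": (np.dtype("f2"), (t - 1, 3, z, y, x)),
        "seg_labels_vis.zarr": (np.dtype("u1"), (t, z, y, x, 3)),
        "seg_labels.zarr": (np.dtype("u1"), (t, z, y, x)),
        "seg_outputs.zarr": (np.dtype("f2"), (t, z, y, x, 3)),
    }


def find_mismatches(arrays, imgs_shape):
    """Compare a {file name: array-like} mapping against the expected
    layout and return a list of human-readable problems (hypothetical
    helper, not ELEPHANT's own validation)."""
    problems = []
    for name, (dtype, shape) in expected_layout(imgs_shape).items():
        arr = arrays.get(name)
        if arr is None:
            problems.append(f"{name}: missing")
        elif arr.dtype != dtype or tuple(arr.shape) != shape:
            problems.append(
                f"{name}: got {arr.dtype} {tuple(arr.shape)}, "
                f"expected {dtype} {shape}"
            )
    return problems


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the on-disk arrays.
    imgs_shape = (3, 2, 4, 4)
    arrays = {
        name: np.zeros(shape, dtype=dtype)
        for name, (dtype, shape) in expected_layout(imgs_shape).items()
    }
    print(find_mismatches(arrays, imgs_shape))
```

The same dtype and shape values could be passed to whichever zarr creation call you already use when writing the empty arrays to disk.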
Notes about uint8 or uint16

The BigDataViewer .h5 files store image data as uint16. If the maximum value in the image data is smaller than 256, we use uint8 to save storage; otherwise we use uint16. At runtime, image data stored in imgs.zarr is converted to float32 and normalized to the range [0, 1].
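
The dtype rule and the runtime conversion described above might look roughly like the sketch below. The function names are hypothetical, and the normalization shown (dividing by the maximum value of the stored integer dtype) is an assumption: the thread only states that the result is float32 in [0, 1], not how ELEPHANT computes it.

```python
import numpy as np


def pick_storage_dtype(data):
    """Choose uint8 when every value fits below 256, otherwise uint16."""
    return np.dtype("uint8") if data.max() < 256 else np.dtype("uint16")


def to_unit_float32(stored):
    """Convert stored integer data to float32 in [0, 1].

    Assumption: scale by the dtype's maximum (255 or 65535). ELEPHANT's
    actual normalization may differ, e.g. it could use the data range.
    """
    scale = np.iinfo(stored.dtype).max
    return stored.astype(np.float32) / np.float32(scale)


if __name__ == "__main__":
    raw = np.array([0, 100, 255], dtype=np.uint16)  # as read from .h5
    stored = raw.astype(pick_storage_dtype(raw))
    print(stored.dtype, to_unit_float32(stored))
```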