saalfeldlab / n5-utils

simple standalone BigDataViewer for multiple N5 (or HDF5) datasets

n5-copy copies long attributes as double attributes #16

Closed hanslovsky closed 3 years ago

hanslovsky commented 4 years ago

Steps to reproduce:

$ mkdir -p test.n5/bla
$ echo '{"blockSize": [1], "dimensions": [1],"dataType":"uint8","compression":{"type":"gzip","level":-1},"maxId":1}' > test.n5/bla/attributes.json
$ n5-copy -i test.n5  -o test.h5 -d /bla -b 1 -c gzip
/bla
  attributes:
    maxId : double
    skipping dataset attribute dataType : class java.lang.String
    skipping dataset attribute compression : class java.lang.Object
    skipping dataset attribute blockSize : class [D
    skipping dataset attribute dimensions : class [D
$ h5ls -v test.h5/bla
Opened "test.h5" with sec2 driver.
bla                      Dataset {1/Inf}
    Attribute: maxId scalar
        Type:      native double
        Data:  1
    Location:  1:2344
    Links:     1
    Chunks:    {1} 1 bytes
    Storage:   1 logical byte, 16 allocated bytes, 6.25% utilization
    Filter-0:  scaleoffset-6 OPT {2, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
    Filter-1:  deflate-1 OPT {6}
    Type:      native unsigned char

See Copy.copyAttributes

axtimwalde commented 3 years ago

This is a side effect of JSON's limited type system, which does not distinguish floating point numbers from integers. I tried to remedy this with value-based inference in https://github.com/saalfeldlab/n5/commit/94c579a23f275eebcff24f8fe5da7741e5b444bc, but that makes integer-valued numbers always come out as long. For the JSON-based backends (Zarr, N5) this is not relevant, but it may be confusing in HDF5. I am considering String-based reasoning, so that it is possible to enforce double numbers by including a decimal point. A follow-up release of N5 will improve this situation, and this project will consume that release, so I am closing the issue in anticipation.
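
For readers following along, here is a minimal sketch of how the two strategies might differ. It is not the N5 implementation: the class and method names (`JsonNumberInference`, `inferFromValue`, `inferFromLiteral`) are hypothetical, and it ignores range checks and the other JSON types.

```java
// Hypothetical illustration only; names and logic are assumptions, not N5 code.
final class JsonNumberInference {

    /**
     * Value-based inference: any number whose value is integral is treated
     * as a long, so "1" and "1.0" cannot be told apart.
     * (Range checks against the long range are omitted in this sketch.)
     */
    static Class<?> inferFromValue(String literal) {
        double value = Double.parseDouble(literal);
        boolean integral = !Double.isInfinite(value) && value == Math.rint(value);
        return integral ? Long.class : Double.class;
    }

    /**
     * String-based inference: the literal's spelling decides. A decimal point
     * or an exponent enforces double; everything else is treated as long.
     */
    static Class<?> inferFromLiteral(String literal) {
        boolean looksFloating = literal.indexOf('.') >= 0
                || literal.indexOf('e') >= 0
                || literal.indexOf('E') >= 0;
        return looksFloating ? Double.class : Long.class;
    }

    public static void main(String[] args) {
        // Print what each strategy infers for a few number literals.
        for (String literal : new String[] {"1", "1.0", "1.5"}) {
            System.out.println(literal
                    + " value-based=" + inferFromValue(literal).getSimpleName()
                    + " string-based=" + inferFromLiteral(literal).getSimpleName());
        }
    }
}
```

Under this sketch, writing `"maxId":1.0` in attributes.json would be the way to request a double with the String-based approach, while `"maxId":1` would stay a long.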