ome / ome2024-ngff-challenge

Project planning and material repository for the 2024 challenge to generate 1 PB of OME-Zarr data
https://pypi.org/project/ome2024-ngff-challenge/
BSD 3-Clause "New" or "Revised" License

Ro-crate json gets generated in incorrect location based on output path #32

Closed sherwoodf closed 3 weeks ago

sherwoodf commented 3 weeks ago

On macOS (unsure whether it matters).

Steps to reproduce:

Run:

poetry run ome2024-ngff-challenge <path/to/image> folder_0/folder_1/folder_2/OUTPUT-IMG.zarr --rocrate-organism "test_value"

The generated zarr is:

OUTPUT-IMG.zarr/
| 0/
| ...
| zarr.json
| folder_0/
- | folder_1/
- - | folder_2/
- - - | OUTPUT-IMG.zarr/
- - - - | ro-crate-metadata.json

Expected output:

OUTPUT-IMG.zarr/
| 0/
| ...
| zarr.json
| ro-crate-metadata.json
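
A quick way to check an output for this symptom is to look for any ro-crate-metadata.json that is not at the top level of the zarr. This is only a sketch; the helper name `find_misplaced_rocrate` is made up for illustration and is not part of ome2024-ngff-challenge.

```python
from pathlib import Path


def find_misplaced_rocrate(zarr_root):
    """Return every ro-crate-metadata.json under zarr_root that is not at its top level.

    Hypothetical helper for checking a converted output; not part of the tool.
    """
    root = Path(zarr_root)
    expected = root / "ro-crate-metadata.json"
    # rglob walks the whole tree, so nested copies such as
    # OUTPUT-IMG.zarr/folder_0/.../ro-crate-metadata.json are reported.
    return sorted(p for p in root.rglob("ro-crate-metadata.json") if p != expected)
```

An empty list means the only RO-Crate file found (if any) sits where expected.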
joshmoore commented 3 weeks ago

@sherwoodf: I'm having trouble reproducing, can you see what I'm doing differently?

(challenge4) ~/opt/challenge/ome2024-ngff-challenge $ ome2024-ngff-challenge resave --cc0 dev2/input.zarr /tmp/even-deeper/deeper/output.zarr --output-overwrite --output-chunks=1,1,256,256 --output-shards=1,1,512,512

(challenge4) ~/opt/challenge/ome2024-ngff-challenge $ find /tmp/even-deeper/ -name "*.json"
/tmp/even-deeper//deeper/output.zarr/labels/0/0/zarr.json
/tmp/even-deeper//deeper/output.zarr/labels/0/1/zarr.json
/tmp/even-deeper//deeper/output.zarr/labels/0/zarr.json
/tmp/even-deeper//deeper/output.zarr/labels/0/3/zarr.json
/tmp/even-deeper//deeper/output.zarr/labels/0/2/zarr.json
/tmp/even-deeper//deeper/output.zarr/labels/zarr.json
/tmp/even-deeper//deeper/output.zarr/0/zarr.json
/tmp/even-deeper//deeper/output.zarr/1/zarr.json
/tmp/even-deeper//deeper/output.zarr/zarr.json
/tmp/even-deeper//deeper/output.zarr/ro-crate-metadata.json
/tmp/even-deeper//deeper/output.zarr/2/zarr.json

Using poetry

(challenge4) ~/opt/challenge/ome2024-ngff-challenge $ mkdir -p /tmp/poetry-prefix

(challenge4) ~/opt/challenge/ome2024-ngff-challenge $ poetry run ome2024-ngff-challenge resave --cc0 dev2/input.zarr /tmp/poetry-prefix/deeper/output.zarr --output-overwrite --output-chunks=1,1,256,256 --output-shards=1,1,512,512

(challenge4) ~/opt/challenge/ome2024-ngff-challenge $ find /tmp/poetry-prefix/ -name "*.json"
/tmp/poetry-prefix//deeper/output.zarr/labels/0/0/zarr.json
/tmp/poetry-prefix//deeper/output.zarr/labels/0/1/zarr.json
/tmp/poetry-prefix//deeper/output.zarr/labels/0/zarr.json
/tmp/poetry-prefix//deeper/output.zarr/labels/0/3/zarr.json
/tmp/poetry-prefix//deeper/output.zarr/labels/0/2/zarr.json
/tmp/poetry-prefix//deeper/output.zarr/labels/zarr.json
/tmp/poetry-prefix//deeper/output.zarr/0/zarr.json
/tmp/poetry-prefix//deeper/output.zarr/1/zarr.json
/tmp/poetry-prefix//deeper/output.zarr/zarr.json
/tmp/poetry-prefix//deeper/output.zarr/ro-crate-metadata.json
/tmp/poetry-prefix//deeper/output.zarr/2/zarr.json
sherwoodf commented 3 weeks ago

@joshmoore You've provided absolute output paths, whereas mine (and, I think, the ones in the README example) are relative. I get the correct result if I use e.g. '~/folder_0/outzarr.zarr', but I got an error* when I output to '../outzarr.zarr' (though that was a deliberate attempt to break things).
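
One way a relative-only symptom like this can arise, shown purely as an illustration: if the output path is prepended to a path that already contains it, an absolute path hides the mistake (a join discards everything before an absolute component) while a relative path gets nested inside itself. Whether resave.py actually does this is an assumption; `joined_twice` is a made-up name.

```python
import posixpath


def joined_twice(output_path):
    """Hypothetical bug pattern: prefix output_path onto a path that already includes it."""
    already_prefixed = posixpath.join(output_path, "ro-crate-metadata.json")
    # posixpath.join drops all earlier components when a later one is absolute,
    # so the duplication only shows up for relative paths.
    return posixpath.join(output_path, already_prefixed)


# Relative path: the zarr path is nested inside itself.
print(joined_twice("folder_0/folder_1/folder_2/OUTPUT-IMG.zarr"))
# Absolute path: the first component is discarded and the result looks correct.
print(joined_twice("/tmp/even-deeper/deeper/output.zarr"))
```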

I suspect that something like os.path.abspath(output_path) might solve my issue, but since the script also deals with S3 buckets, I'd have to take a longer look to understand how to change it (it's maybe more obvious to you). Otherwise, a note in or update to the examples might be simpler.
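
A minimal sketch of that idea, assuming remote outputs are given as URLs with a scheme such as s3:// (I haven't checked how the script represents them). The helper name `normalize_output_path` and the scheme list are hypothetical. os.path.abspath also collapses '..' components, which might avoid the 'Invalid key' error below as well.

```python
import os
from urllib.parse import urlparse

# Assumed set of remote schemes; adjust to whatever the script really accepts.
REMOTE_SCHEMES = ("s3", "gs", "http", "https")


def normalize_output_path(output_path):
    """Hypothetical helper: make local paths absolute, leave remote URLs untouched."""
    if urlparse(output_path).scheme in REMOTE_SCHEMES:
        return output_path
    # expanduser handles a quoted '~'; abspath resolves against the current
    # working directory and normalizes '..' segments.
    return os.path.abspath(os.path.expanduser(output_path))
```

With this, 'folder_0/folder_1/folder_2/OUTPUT-IMG.zarr' and '../outzarr.zarr' would both be resolved against the current working directory before any writing happens.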

*The error message:

  File "/Users/fsherwood/workspace/ngff/ome2024-ngff-challenge/src/ome2024_ngff_challenge/resave.py", line 422, in convert_array
    write = ts.open(write_config).result()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: Error opening "zarr3" driver: Invalid key: "../outzarr.zarr/0" [source locations='tensorstore/kvstore/file/file_key_value_store.cc:810\ntensorstore/driver/driver.cc:112'] [tensorstore_spec='{\"context\":{\"cache_pool\":{},\"data_copy_concurrency\":{},\"file_io_concurrency\":{},\"file_io_sync\":true},\"create\":true,\"delete_existing\":true,\"driver\":\"zarr3\",\"dtype\":\"uint8\",\"kvstore\":{\"driver\":\"file\",\"path\":\"../outzarr.zarr/0/\"},\"metadata\":{\"chunk_grid\":{\"configuration\":{\"chunk_shape\":[1,2,57,512,1024]},\"name\":\"regular\"},\"chunk_key_encoding\":{\"name\":\"default\"},\"codecs\":[{\"configuration\":{\"chunk_shape\":[1,1,1,512,1024],\"codecs\":[{\"configuration\":{\"endian\":\"little\"},\"name\":\"bytes\"},{\"configuration\":{\"clevel\":5,\"cname\":\"zstd\"},\"name\":\"blosc\"}],\"index_codecs\":[{\"configuration\":{\"endian\":\"little\"},\"name\":\"bytes\"},{\"name\":\"crc32c\"}],\"index_location\":\"end\"},\"name\":\"sharding_indexed\"}],\"data_type\":\"uint8\",\"dimension_names\":[\"t\",\"c\",\"z\",\"y\",\"x\"],\"node_type\":\"array\",\"shape\":[1,2,57,512,1024]},\"transform\":{\"input_exclusive_max\":[[1],[2],[57],[512],[1024]],\"input_inclusive_min\":[0,0,0,0,0],\"input_labels\":[\"t\",\"c\",\"z\",\"y\",\"x\"]}}']
joshmoore commented 3 weeks ago

Gotcha! Thanks. Fix pushed in 24077e60cc887772ffc3b105a93caec670f8a94e