dask / dask-image

Distributed image processing
http://image.dask.org/en/latest/
BSD 3-Clause "New" or "Revised" License

jp2 slicing #359

Open YangForever opened 3 months ago

YangForever commented 3 months ago

Describe the issue: Hi, I am trying to use dask_image.imread.imread() to speed up sub-volume extraction from my dataset of .jp2 files, but it raises an error when I extract multiple slices.

Pseudocode example:

import dask_image.imread

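def read_subvol_stack(path, file_type):
    # Lazily build a (frames, height, width) array from all matching files
    img_array_dask = dask_image.imread.imread(f"{path}/*.{file_type}")
    print(img_array_dask)
    # Slicing is lazy; compute() triggers the actual reads (fails for .jp2)
    img_array = img_array_dask[0:900, :, :].compute()
    return img_array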

The print() function gives:

dask.array<_map_read_frame, shape=(3770, 1898, 1898), dtype=uint16, chunksize=(1, 1898, 1898), chunktype=numpy.ndarray>

When compute() runs, an error occurs:

ValueError: could not broadcast input array from shape (1,1898,1898) into shape (1,1,1898)

I have checked: when compute() runs, imread() reads each image with shape (1, 1, 1898, 1898), which does not match the declared chunk shape of (1, 1898, 1898), so the broadcast fails.
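To make the shape mismatch concrete, here is a minimal NumPy-only sketch. It is not dask-image's code: drop_leading_singletons is a hypothetical helper, and the small array stands in for a frame that a reader returns with extra leading length-1 axes. It only shows how such a frame could be normalized to the (1, height, width) chunk shape that the dask array declares.

```python
import numpy as np


def drop_leading_singletons(frame, target_ndim=2):
    """Strip extra leading length-1 axes until the frame has target_ndim dims.

    Hypothetical helper for illustration; not part of dask-image.
    """
    frame = np.asarray(frame)
    while frame.ndim > target_ndim:
        if frame.shape[0] != 1:
            raise ValueError(
                f"cannot reduce shape {frame.shape} to {target_ndim} dimensions"
            )
        frame = frame[0]
    return frame


# Simulated reader output with two extra leading axes (small sizes for speed).
raw = np.zeros((1, 1, 4, 4), dtype=np.uint16)

plane = drop_leading_singletons(raw)  # 2D image plane
chunk = plane[np.newaxis]             # shape a (1, height, width) chunk expects

print(raw.shape, plane.shape, chunk.shape)
```

Whether the extra axes come from the underlying JPEG 2000 reader or from how frames are counted is not established in this thread, so treat this only as an illustration of the mismatch.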

The code works well for .tif or .png images.

Anything else we need to know?: I have reproduced the error in this repository, in case it helps: https://github.com/YangForever/DaskImageSlicing/tree/main

Environment:

m-albert commented 2 months ago

@YangForever thanks for reporting this issue 🙏

Actually, there are several known problems with the current dask_image.imread implementation (see https://github.com/dask/dask-image/issues/229) and we recommend considering one of the readers mentioned here.
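Independent of which reader is chosen, one generic dask pattern is to wrap a per-file read function in dask.delayed and stack the results, declaring the per-plane shape explicitly. The sketch below is an assumption-laden illustration, not the recommended reader from this thread: lazy_stack and fake_read_plane are hypothetical names, and fake_read_plane stands in for whatever function actually decodes one .jp2 file into a 2D array.

```python
import numpy as np
import dask
import dask.array as da


def lazy_stack(paths, read_plane, shape, dtype):
    """Build a lazy (n_files, height, width) array, one chunk per file.

    read_plane(path) must return a 2D array of the given shape and dtype.
    """
    lazy_reader = dask.delayed(read_plane)
    planes = [
        da.from_delayed(lazy_reader(p), shape=shape, dtype=dtype)
        for p in paths
    ]
    return da.stack(planes, axis=0)


def fake_read_plane(path):
    # Stand-in for a real JPEG 2000 reader (assumption): returns a constant
    # image whose value is the length of the path string.
    return np.full((4, 4), len(path), dtype=np.uint16)


stack = lazy_stack(
    ["a.jp2", "bb.jp2", "ccc.jp2"], fake_read_plane, shape=(4, 4), dtype=np.uint16
)
sub = stack[0:2].compute()  # only the first two files are read
print(stack.shape, sub.shape)
```

Because the plane shape is stated up front, a reader that returns extra singleton axes would need its output squeezed to 2D inside read_plane before being returned.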