AllenCellModeling / aicspylibczi

Python module utilizing libCZI for reading Zeiss CZI files.
https://allencellmodeling.github.io/aicspylibczi
GNU General Public License v3.0

reading in raw lightsheet data (CZI file) #81

Closed pr4deepr closed 3 years ago

pr4deepr commented 3 years ago

System and Software

Description

I am trying to open a CZI file that contains raw lightsheet data, i.e., not deskewed. The deskewed data opens fine as a CZI file, but the raw (not deskewed) data throws an error. The idea is to read in the raw data and perform deskewing and deconvolution in Python.

Expected Behavior

Expected it to return the czi file as a dask array

Reproduction

This is just example code for troubleshooting. I was initially using aicsimageio directly via imread_dask and got the same error.

from aicsimageio.readers import czi_reader
from aicspylibczi import CziFile
img='D://Pradeep//Lightsheet//skew_deskew_example/image.czi'
czi_deskew = CziFile(img)
czi_reader.CziReader._daread(img,czi_deskew)

It throws an error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-ba37ba0ae147> in <module>
      1 # Read first plane for information used by dask.array.from_delayed
----> 2 sample, sample_dims = czi.read_image(**first_plane_read_dims)
      3 print(sample_dims)

~\AppData\Local\Continuum\anaconda3\envs\lightsheet\lib\site-packages\aicspylibczi\CziFile.py in read_image(self, **kwargs)
    386         #print(cores)
    387         #print(plane_constraints)
--> 388         image, shape = self.reader.read_selected(plane_constraints, m_index, cores)
    389         #print(shape)
    390         #print(image)

RuntimeError: The method or operation is not implemented.

I am not sure what this error means.

I have deskewed data generated from the same dataset by another source, and it works really well, giving the output:
(dask.array<concatenate, shape=(119, 3, 75, 1166, 1488), dtype=uint16, chunksize=(1, 1, 75, 1166, 1488), chunktype=numpy.ndarray>,
 'TCZYX')

The dimensions of the raw data are: (119, 3, 751, 150, 1488) in TCZYX format. The dimensions of the deskewed data that works are: (119, 3, 75, 1166, 1488) in TCZYX format.

Environment

Anaconda Environment

Thanks Pradeep

heeler commented 3 years ago

Hi @pr4deepr,

I'll look into this as soon as I have work on the 3.0 release completed. It would be ideal to have a test file from your system that shows the problematic behavior, if at all possible. It is somewhat likely that the raw/skewed data is not supported by libCZI. I might be able to patch libCZI to make that work, but that's an unknown. If you can get me a small test file, that would be fantastic. I hope to take a look at it within a week.

Thanks @heeler

pr4deepr commented 3 years ago

Hi, thank you for getting back to me. I'll see if I can get a sample image dataset early next week.

Cheers Pradeep


pr4deepr commented 3 years ago

Hi @heeler

Please find the data here. It's a WeTransfer link. I can use the czi2tif option from here: https://github.com/cgohlke/czifile to convert small CZI files into TIFF files, and I can access the metadata. But it's only sensible for small files...

Cheers Pradeep

evamaxfield commented 3 years ago

Hey @pr4deepr I just saw your talk on Dask Summit and it served as a reminder for me to check this issue :sweat_smile: (sorry for the delay)!

@heeler has unfortunately taken a new job so I will have to get myself caught up on what is going on with this issue. I am curious if you have encountered the chunking problem on other file formats. Is it just CZI or does it affect the whole aicsimageio lib?

Excited to chat at the Dask summit life sciences workshop too!

pr4deepr commented 3 years ago

Hey @JacksonMaxfield

I just saw your talk. I really enjoyed it and I think it answered some questions that I had about processing the large datasets and memory usage!!

> I just saw your talk on Dask Summit and it served as a reminder for me to check this issue 😅 (sorry for the delay)!

Apologies, didn't mean to put up the github issue like that, I just wanted to showcase my workflow and where I'm at.

Currently, I have only tried it on CZI files. I can access the CZI file and explore the metadata and subblocks using the czifile library. I get the error only when I try to read it in using the aicspylibczi or aicsimageio libraries, especially as a dask array.

We mainly use Zeiss microscopes here, and particularly the Zeiss Lattice in this case. I haven't tried it on other file formats. We have a home-built lattice, so I can try it on the TIFF files that it churns out. Would that work for you?

It will be great to chat with you. Which session, or what time, will you be attending the life science workshop?

evamaxfield commented 3 years ago

> Apologies, didn't mean to put up the github issue like that, I just wanted to showcase my workflow and where I'm at.

No worries at all. It was a helpful reminder for me and useful to hear about the issues.

Hmmm, well, normally I would ask you to upgrade to aicsimageio 4.0.0.dev6, but CZI reading hasn't made it into that dev release yet. I tried, and we do have benchmarks that show the peak memory used while reading files (and I manually ran some tests last night) to make sure that, at least for TIFFs, we aren't reading more data into memory than asked for: see the 4.0.0 benchmarks. If you click on the AICSImageIO peakmem benchmarks, you can see that cached_array and delayed_array differ considerably in MB read during the process. But I will continue to look into the memory issue.

Also, a note from your talk: the size shown in the Dask array Jupyter/HTML repr isn't the number of bytes already read, just the size of all the chunks combined.
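
As a rough illustration of that point, the size in the repr can be reproduced from shape and dtype alone, without touching the file. The sketch below uses the shape and chunk size of the deskewed array reported earlier in this thread and assumes uint16 (2 bytes per pixel); `nbytes` here is a hypothetical helper written for this example, not a Dask function.

```python
import math


def nbytes(shape, itemsize):
    """Bytes needed to hold an array of this shape; no pixel data is read."""
    return math.prod(shape) * itemsize


UINT16_ITEMSIZE = 2  # bytes per pixel for uint16

full_shape = (119, 3, 75, 1166, 1488)   # TCZYX, from the deskewed file above
chunk_shape = (1, 1, 75, 1166, 1488)    # one chunk = one Z stack

total = nbytes(full_shape, UINT16_ITEMSIZE)
per_chunk = nbytes(chunk_shape, UINT16_ITEMSIZE)

print(f"whole array: {total / 2**30:.1f} GiB")
print(f"one chunk:   {per_chunk / 2**20:.1f} MiB")
```

This comes to roughly 86.5 GiB for the whole array and about 248 MiB per chunk, which is the total the repr would report even if nothing has been loaded yet.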


Now, on to your current issue. I will manually give your file a go on the newest release of aicspylibczi and see if I can find anything.

And lastly, I will try to go to both life science workshop sessions but will for sure be at the first one. (May 19, 16:00 PST / May 19, 23:00 UTC).

pr4deepr commented 3 years ago

Thanks a lot for looking into this..

The WeTransfer link expired, so I'm posting another link here: DOWNLOAD

evamaxfield commented 3 years ago

Had a brief moment to look at this this morning. On both the prior and new versions of aicspylibczi it produces the error you noticed. So reproducible! Yay?

What an odd error. I will try to find a time to talk to Jamie about this and see what I can do. I assume it's something to do with typing. Because the underlying reader is written in C++ I wonder if your file has a different type return for some operation which is causing it to say it has no impl for those specific types.

pr4deepr commented 3 years ago

Yea, I had a look and realised the reader is in C++, which is where I hit a wall!!!

So, I was comparing a raw data file and the corresponding deskewed/processed data. The latter opens in aicsimageio. I have been playing around with using CziFile to explore the underlying data structure.

From what I understand about czi files, the data are in subblocks, which are in turn contained in subblock directories.
Using info from the code here: https://github.com/cgohlke/czifile/blob/a70265fd430983875bf4c31955f2ad57f2592747/czifile/czifile.py#L644

I can access each subblock, which contains the image data, through data_segment(). That is my understanding of the CZI file.

So, if I look at the first subblock:

import numpy as np
from czifile import CziFile  # assumed: czifile's CziFile, not aicspylibczi's
from tifffile import FileHandle

img_raw = "RAW DATA.czi"
czi_raw = CziFile(img_raw)

# Locate the data segment of the first subblock.
entry = czi_raw.filtered_subblock_directory[0]
subblock = entry.data_segment()

# Open the CZI as a plain binary file and move to the start of this subblock's pixel data.
fh_raw = FileHandle(img_raw)
fh_raw.seek(subblock.data_offset)

# Read the stored bytes of the subblock and reshape them to the stored shape.
dtype = np.dtype(subblock.dtype)
data = fh_raw.read_array(dtype, subblock.data_size // dtype.itemsize)
czi_image_raw = data.reshape(entry.stored_shape)

What information would be valuable to compare the raw and deskewed data?

pr4deepr commented 3 years ago

BTW, are you comfortable with me posting this in the image.sc forum? I am in a workshop with Sebastian Rhodes from Zeiss and he suggested posting it there.

evamaxfield commented 3 years ago

Please do! More eyes the better probably.

evamaxfield commented 3 years ago

Hey @pr4deepr, we will try to take a deeper look at this issue soon. In fact, @heeler may find some time to do so soon :tada:. But other than that, no real update unfortunately, just "this issue is still on our radar" :/

pr4deepr commented 3 years ago

Thanks for that @JacksonMaxfield and @heeler ! Appreciate you taking the time for this...

evamaxfield commented 3 years ago

Hey @pr4deepr, just pinging again to say don't worry, we are still tracking this issue, but unfortunately no development has occurred yet. We hope to look at it soon but, again, there is no real timeline. Apologies.

pr4deepr commented 3 years ago

Thanks!

toloudis commented 3 years ago

Initial finding: the error message "The method or operation is not implemented." comes from the underlying Zeiss libCZI when it thinks there is an internal compression format it doesn't recognize. It recognizes "JpgXr" and "UnCompressed" according to the code. I am still looking deeper to see how it got there.

toloudis commented 3 years ago

Looks like the file contains compression mode 1001 which the libCZI library doesn't recognize and considers "invalid".
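
A minimal sketch of that failure mode, not libCZI's actual code: a reader that only knows certain compression codes has to refuse everything else. The code-to-name table below is an assumption (0 for UnCompressed and 4 for JpgXr, matching the two modes named above), and `classify_compression` and the example code lists are hypothetical.

```python
from collections import Counter

# Assumed mapping of CZI compression codes to the two modes named in this thread.
SUPPORTED_MODES = {0: "UnCompressed", 4: "JpgXr"}


def classify_compression(codes):
    """Count subblocks per compression mode; unknown codes are flagged."""
    counts = Counter()
    for code in codes:
        counts[SUPPORTED_MODES.get(code, f"unsupported ({code})")] += 1
    return dict(counts)


# Hypothetical per-subblock codes: the raw lightsheet file vs. a readable file.
raw_file_codes = [1001, 1001, 1001]
readable_file_codes = [0, 0, 0]

print(classify_compression(raw_file_codes))
print(classify_compression(readable_file_codes))
```

A summary like this makes it easy to see at a glance whether every subblock in a file falls outside the set of modes the reader can decode.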

pr4deepr commented 3 years ago

Thanks for this update.
Glad to see that you've figured out why we're getting the error.

toloudis commented 3 years ago

https://github.com/zeiss-microscopy/libCZI/issues/54#issuecomment-874072385

pr4deepr commented 3 years ago

Hi, just updating this thread. There was a bit of a delay in getting my hands on some CZI files. With files saved using the newest version of the Zen software (3.4 onwards), aicsimageio reads the CZI files without a problem. Older files need to be re-saved using the Save As CZI option in Zen before aicsimageio can read them.
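
The workaround above could be wrapped in a small guard so that older files fail with a clearer hint. This is only a sketch: `read_with_hint` and its `reader` argument are hypothetical, and `reader` stands in for whatever function you use to load the file (it is not a specific aicsimageio call). It relies only on the error text reported earlier in this thread.

```python
NOT_IMPLEMENTED_TEXT = "not implemented"  # taken from the RuntimeError in this thread


def read_with_hint(path, reader):
    """Call reader(path); turn the 'not implemented' error into an actionable one."""
    try:
        return reader(path)
    except RuntimeError as err:
        if NOT_IMPLEMENTED_TEXT in str(err).lower():
            raise RuntimeError(
                f"Could not read {path!r}: the file may use a compression mode "
                "the reader does not support. Re-saving it with 'Save As CZI' "
                "in a recent Zen version may help."
            ) from err
        raise  # unrelated errors pass through unchanged


# Stand-in reader that mimics the failure, for demonstration only.
def failing_reader(path):
    raise RuntimeError("The method or operation is not implemented.")


try:
    read_with_hint("old_file.czi", failing_reader)
except RuntimeError as err:
    print(err)
```

Passing the reader in as an argument keeps the guard independent of any particular library version.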

I really appreciate the rapid response and help in this matter.

Do let me know if there is anything else I need to provide

Cheers Pradeep

evamaxfield commented 3 years ago

Well, glad it was solved. Going to close this issue for now then. If it comes up again, or if any other issues crop up, just let us know.