NEUBIAS / training-resources

Resources for teaching/preparing to teach bioimage analysis
https://neubias.github.io/training-resources

Jupyter notebook to teach on how to handle different file formats with python #705

Open felixS27 opened 1 month ago

felixS27 commented 1 month ago

I talked with @tischi three weeks ago about extending the module 'Image file formats' (https://neubias.github.io/training-resources/image_file_formats/index.html) to include loading images into Jupyter notebooks and handling their metadata. So I created a notebook that uses the package 'aicsimageio' to load different types of image files and read their metadata. The nice thing about this package is that it is simple to use and gives a fairly uniform way of accessing the image data and some important metadata such as pixel/voxel sizes. My proposed notebook does not copy the steps done with ImageJ, but it covers the following points:

Please let me know if this notebook is useful for upcoming teaching, if it meets the overall standards of the teaching material, and if there are things that should be improved or extended. Personally, I work mainly with .nd2 files when it comes to loading files and reading or using their metadata (these are not covered in the notebook, but I could easily add them if there are some example files). Although handling different file formats is pretty straightforward, I have not had much exposure to other file formats in my daily routine. While creating the notebook, I stumbled over the following points, which I want to discuss:

I hope this notebook is a good starting point for extending the module 'Image file formats', and I am happy about all kinds of constructive feedback and further ideas.

LoadingImageFiles.md

felixS27 commented 1 month ago

I just realized that there are some open and heavily discussed issues on the topic I describe here (see #572, #462 and #471). I hope that starting a new issue is not a big problem.

tischi commented 1 month ago

Thanks a lot @felixS27 !

For convenience I just pasted the markdown text here below:


Image file formats

To execute this notebook, create the following minimal environment:

conda create -n ImageFileFormats python=3.10 numpy
conda activate ImageFileFormats
pip install aicsimageio==4.14.0



To read .lif files

pip install readlif==0.6.5



To read .czi files

pip install aicspylibczi==3.2.0 fsspec==2023.6.0





For more information on the module aicsimageio and further supported file formats please refer to: https://allencellmodeling.github.io/aicsimageio/index.html#

Load image

Image with minimal metadata and .tif file format

# load image with minimal metadata
from aicsimageio import AICSImage
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_8bit__nuclei_PLK1_control.tif"
aicsimage_object = AICSImage(image_url)
print(aicsimage_object)
print(type(aicsimage_object))
<AICSImage [Reader: TiffReader, Image-is-in-Memory: False]>
<class 'aicsimageio.aics_image.AICSImage'>

AICSImage always returns an 'AICSImage' object.
Internally, AICSImage always checks the image file format and then uses the appropriate reader (see 'Reader').
By default, AICSImage does not load the image directly but creates a lazy representation of it (see 'Image-is-in-Memory').
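
To make the idea of a lazy representation concrete, here is a minimal toy sketch. It is not how aicsimageio is implemented; the class `LazyImage` and the function `fake_reader` are hypothetical names invented for this illustration. The point is only that constructing the object reads nothing, and the first access to `.data` triggers the read.

```python
class LazyImage:
    """Toy stand-in for a lazily loaded image (not aicsimageio's implementation)."""

    def __init__(self, loader):
        self._loader = loader  # callable that performs the expensive read
        self._data = None      # nothing is read at construction time

    @property
    def in_memory(self):
        return self._data is not None

    @property
    def data(self):
        # Read on first access only, then reuse the cached result.
        if self._data is None:
            self._data = self._loader()
        return self._data

    def __repr__(self):
        return f"<LazyImage [Image-is-in-Memory: {self.in_memory}]>"


read_calls = []


def fake_reader():
    # Hypothetical reader: records each call and returns a tiny 2x2 "image".
    read_calls.append("read")
    return [[1, 1], [1, 2]]


image = LazyImage(fake_reader)
print(image)         # no read has happened yet
pixels = image.data  # first access triggers the read
print(image)         # the data is now cached in memory
```

Accessing `image.data` a second time reuses the cached result, so `fake_reader` is called only once.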

# Inspect dimensions of the object
print(aicsimage_object.dims)
print(aicsimage_object.shape)
print(f'Dimension order is: {aicsimage_object.dims.order}')
print(type(aicsimage_object.dims.order))
print(f'Size of X dimension is: {aicsimage_object.dims.X}')
<Dimensions [T: 1, C: 1, Z: 1, Y: 682, X: 682]>
(1, 1, 1, 682, 682)
Dimension order is: TCZYX
<class 'str'>
Size of X dimension is: 682

By default, AICSImage objects are 5-dimensional, with the order Time, Channels, Z dimension, Y dimension, X dimension (TCZYX).
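
Because the dimension order is a plain string, it can be paired with the shape tuple to look up sizes or axis positions by name. The following is a small sketch using only the values printed above; the helper `dims_to_sizes` is a hypothetical name and not part of aicsimageio.

```python
def dims_to_sizes(order, shape):
    """Map each dimension letter to its size, e.g. 'TCZYX' and (1, 1, 1, 682, 682)."""
    if len(order) != len(shape):
        raise ValueError(f"order {order!r} does not match shape {shape}")
    return dict(zip(order, shape))


sizes = dims_to_sizes("TCZYX", (1, 1, 1, 682, 682))
print(sizes)

# Axis position of a dimension in the array, useful for indexing by name
y_axis = "TCZYX".index("Y")
print(y_axis)
```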

# Access image data
image_data = aicsimage_object.data

# Inspect image type
print(type(image_data))

print(image_data)

#Inspect image shape
print(image_data.shape)
<class 'numpy.ndarray'>
[[[[[1 1 4 ... 0 0 0]
    [1 2 1 ... 0 0 0]
    [3 0 2 ... 0 3 0]
    ...
    [0 0 0 ... 0 0 0]
    [0 0 0 ... 2 0 0]
    [0 0 0 ... 0 0 0]]]]]
(1, 1, 1, 682, 682)

With AICSImage.data, the actual image data is loaded as a 5-dimensional numpy.ndarray (dimensions that are missing in the file are kept as axes of length 1).

# Access specific portion of image data
yx_image_data = aicsimage_object.get_image_data('YX')

# Inspect image type
print(type(yx_image_data))

print(yx_image_data)

#Inspect image shape
print(yx_image_data.shape)
<class 'numpy.ndarray'>
[[1 1 4 ... 0 0 0]
 [1 2 1 ... 0 0 0]
 [3 0 2 ... 0 3 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 2 0 0]
 [0 0 0 ... 0 0 0]]
(682, 682)

With AICSImage.get_image_data it is possible to specify which dimensions the returned numpy.ndarray should contain.
Internally, the whole 5-dimensional image is loaded and then sliced according to the specifications.
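
The "load everything, then slice" behaviour described above can be imitated with plain numpy. This is only a sketch of the idea, not the library's code: the function `select_dims` and the small test array are assumptions made for illustration. Dimensions that are not requested are fixed to the given index (0 by default), and the remaining axes are returned in the requested order.

```python
import numpy as np


def select_dims(array5d, out_dims, order="TCZYX", **fixed):
    """Keep the axes named in out_dims; index all other axes (default index 0)."""
    unknown = set(out_dims) - set(order)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    # Build one index entry per axis: full slice if kept, integer otherwise.
    index = tuple(
        slice(None) if dim in out_dims else fixed.get(dim, 0) for dim in order
    )
    sliced = array5d[index]
    # The kept axes are still in 'order'; reorder them to match out_dims.
    kept = [dim for dim in order if dim in out_dims]
    return np.transpose(sliced, [kept.index(dim) for dim in out_dims])


# Small stand-in for a TCZYX image: 1 time point, 4 channels, 1 z plane, 3 x 5 pixels
stack = np.arange(60).reshape(1, 4, 1, 3, 5)

cyx = select_dims(stack, "CYX")
yx = select_dims(stack, "YX", C=2)
xy = select_dims(stack, "XY", C=2)
print(cyx.shape, yx.shape, xy.shape)
```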

# Inspect pixel size of image
import numpy as np
print(aicsimage_object.physical_pixel_sizes)
print(f'A pixel has a length of {np.round(aicsimage_object.physical_pixel_sizes.X,2)} microns in X dimension.')
PhysicalPixelSizes(Z=None, Y=0.16605318318140297, X=0.16605318318140297)
A pixel has a length of 0.17 microns in X dimension.
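
Pixel sizes are typically used to convert pixel counts into physical lengths. Here is a minimal sketch that uses the values printed above. `PixelSizes` is a stand-in namedtuple with the same field names as the printed output, and `physical_extent` is a hypothetical helper. A dimension without a pixel size (None) is passed through as None instead of raising an error.

```python
from collections import namedtuple

# Stand-in with the same field names as the printed PhysicalPixelSizes
PixelSizes = namedtuple("PixelSizes", ["Z", "Y", "X"])


def physical_extent(shape_zyx, pixel_sizes):
    """Multiply pixel counts by pixel sizes; a missing size (None) stays None."""
    return PixelSizes(*(
        None if size is None else count * size
        for count, size in zip(shape_zyx, pixel_sizes)
    ))


sizes = PixelSizes(Z=None, Y=0.16605318318140297, X=0.16605318318140297)
extent = physical_extent((1, 682, 682), sizes)
print(extent)
print(f"Field of view in X: {extent.X:.2f} microns")
```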
# Inspect image metadata
print(type(aicsimage_object.metadata))

print(aicsimage_object.metadata)
<class 'str'>
ImageJ=1.53c
unit=micron
finterval=299.35504150390625
min=1.0
max=125.0
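
Since the metadata of this file is returned as a plain string of key=value lines, it needs to be parsed before individual entries can be used. The following is a minimal sketch, assuming the string has exactly the layout printed above; `parse_key_value_metadata` is a hypothetical helper name.

```python
def parse_key_value_metadata(text, sep="="):
    """Turn 'key=value' lines into a dict; lines without the separator are skipped."""
    metadata = {}
    for line in text.splitlines():
        key, found, value = line.partition(sep)
        if not found:
            continue
        metadata[key.strip()] = value.strip()
    return metadata


# The metadata string as printed above
raw = "ImageJ=1.53c\nunit=micron\nfinterval=299.35504150390625\nmin=1.0\nmax=125.0"

meta = parse_key_value_metadata(raw)
print(meta)

# All values are strings; convert explicitly where a number is needed
frame_interval = float(meta["finterval"])
print(frame_interval)
```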

Image with extensive metadata and .tif file format

# load image with extensive metadata
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_16bit__collagen.md.tif"
aicsimage_object = AICSImage(image_url)
print(aicsimage_object)
print(type(aicsimage_object))
<AICSImage [Reader: TiffReader, Image-is-in-Memory: False]>
<class 'aicsimageio.aics_image.AICSImage'>
# Inspect dimensions of the object
print(aicsimage_object.dims)
<Dimensions [T: 1, C: 1, Z: 1, Y: 2160, X: 2160]>
# Access image data
image_data = aicsimage_object.data

# Inspect image type
print(type(image_data))

print(image_data)

#Inspect image shape
print(image_data.shape)
<class 'numpy.ndarray'>
[[[[[400 428 371 ... 548 655 713]
    [433 354 362 ... 566 559 602]
    [407 401 406 ... 559 551 539]
    ...
    [410 390 390 ... 464 476 462]
    [412 434 424 ... 558 656 594]
    [430 504 492 ... 684 933 886]]]]]
(1, 1, 1, 2160, 2160)
# Inspect pixel size of image
print(aicsimage_object.physical_pixel_sizes)
print(f'A pixel has a length of {np.round(aicsimage_object.physical_pixel_sizes.X,2)} microns in X dimension.')
PhysicalPixelSizes(Z=None, Y=352.77777777777777, X=352.77777777777777)
A pixel has a length of 352.78 microns in X dimension.
# Inspect image metadata
print(type(aicsimage_object.metadata))

print(aicsimage_object.metadata)
<class 'str'>
Experiment base name:Karim-240723-005
Experiment set:Nadine

Exposure: 600 ms
Binning: 1 x 1
Region: 2160 x 2160, offset at (200, 0)
Acquired from AndorSdk3 Camera
Subtract: Off
Shading: Off
Digitizer: 200 MHz - lowest noise
Gain: 16-bit (low noise & high well capacity)
Electronic Shutter: Rolling
Baseline Clamp Enabled: Yes
Cooler On: 1
Frames to Average: 1
Trigger Mode: Normal (TIMED)
Temperature: -0.44

Image with .lif file format (Leica image file)

# load image from Leica microscope file format
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_xyc__two_images.lif"
aicsimage_object = AICSImage(image_url)
print(aicsimage_object)
print(type(aicsimage_object))
<AICSImage [Reader: LifReader, Image-is-in-Memory: False]>
<class 'aicsimageio.aics_image.AICSImage'>
# Inspect dimensions of the object
print(aicsimage_object.dims)
<Dimensions [T: 1, C: 4, Z: 1, Y: 1024, X: 1024]>
# Access all 4 channels
img_4channel = aicsimage_object.data

# Alternative
img_4channel = aicsimage_object.get_image_data('CYX')

#Inspect image type and shape
print(type(img_4channel))
print(img_4channel.shape)

#Access only one specific channel
img_1channel = aicsimage_object.get_image_data('YX',C=0)

#Inspect image type and shape
print(type(img_1channel))
print(img_1channel.shape)
<class 'numpy.ndarray'>
(4, 1024, 1024)
<class 'numpy.ndarray'>
(1024, 1024)
# Inspect scene metadata
print(aicsimage_object.scenes)

# Get current scene
print(f'Current scene: {aicsimage_object.current_scene}')
('Series001', 'Image004')
Current scene: Series001
# Explore different scenes

# Select first scene (index 0, as Python is zero-indexed)
aicsimage_object.set_scene(0)
img_scene1 = aicsimage_object.data
print(f'Image shape, first scene: {img_scene1.shape}')

# Select second scene
aicsimage_object.set_scene(1)
img_scene2 = aicsimage_object.data
print(f'Image shape, second scene: {img_scene2.shape}')
Image shape, first scene: (1, 4, 1, 1024, 1024)
Image shape, second scene: (1, 1, 1, 512, 512)
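
When a file contains many scenes, the same steps can be wrapped in a loop. The sketch below uses only the attributes shown in this notebook (`.scenes`, `.current_scene`, `.set_scene()` and `.shape`). `shapes_per_scene` is a hypothetical helper, and `FakeMultiSceneImage` is a stand-in that mirrors the shapes printed above so that the example runs without the .lif file. With a real AICSImage object, each `set_scene` call may involve reading from the file.

```python
def shapes_per_scene(image):
    """Return {scene name: shape} and restore the originally selected scene."""
    original_index = image.scenes.index(image.current_scene)
    shapes = {}
    for index, name in enumerate(image.scenes):
        image.set_scene(index)
        shapes[name] = image.shape
    image.set_scene(original_index)  # leave the object as we found it
    return shapes


class FakeMultiSceneImage:
    """Hypothetical stand-in with the scene names and shapes printed above."""

    def __init__(self):
        self._shapes = {
            "Series001": (1, 4, 1, 1024, 1024),
            "Image004": (1, 1, 1, 512, 512),
        }
        self.scenes = tuple(self._shapes)
        self.current_scene = self.scenes[0]

    def set_scene(self, index):
        self.current_scene = self.scenes[index]

    @property
    def shape(self):
        return self._shapes[self.current_scene]


fake_image = FakeMultiSceneImage()
result = shapes_per_scene(fake_image)
for scene_name, scene_shape in result.items():
    print(f"{scene_name}: {scene_shape}")
```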
# Inspect pixel size of image
print(aicsimage_object.physical_pixel_sizes)
print(f'A pixel has a length of {np.round(aicsimage_object.physical_pixel_sizes.X,2)} microns in X dimension.')

# Inspect channel metadata
aicsimage_object.set_scene(0)
print(aicsimage_object.channel_names)
PhysicalPixelSizes(Z=None, Y=0.3613219178082192, X=0.3613219178082192)
A pixel has a length of 0.36 microns in X dimension.
['Blue', 'Green', 'Yellow', 'Red']
# Inspect image metadata
print(type(aicsimage_object.metadata))

print(aicsimage_object.metadata)
<class 'xml.etree.ElementTree.Element'>
<Element 'LMSDataContainerHeader' at 0x177226930>
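
Printing an xml.etree.ElementTree.Element only shows its tag and memory address. To see what it contains, the tree has to be traversed. The sketch below shows the standard-library calls for this on a small invented XML snippet. The tag and attribute names are made up for illustration and are not the actual Leica metadata schema, so the first step with a real file would be to list the tags that actually occur.

```python
import xml.etree.ElementTree as ET

# Invented example document; NOT the real .lif metadata layout
xml_text = """
<Header Version="2">
  <Image Name="Series001">
    <Channel Name="Blue"/>
    <Channel Name="Green"/>
  </Image>
  <Image Name="Image004">
    <Channel Name="Gray"/>
  </Image>
</Header>
""".strip()

root = ET.fromstring(xml_text)

# Tag and attributes of the root element
print(root.tag, root.attrib)

# All distinct tags in the tree, a useful first look at unknown metadata
tags = sorted({element.tag for element in root.iter()})
print(tags)

# Collect an attribute from nested elements
channels = {
    image.get("Name"): [channel.get("Name") for channel in image.findall("Channel")]
    for image in root.findall("Image")
}
print(channels)
```

With the notebook's object, `aicsimage_object.metadata` would take the place of `root`.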

Image with .czi file format (Carl Zeiss image)

# load image from Zeiss microscope file format
# Can't be loaded from url!
# download image from https://github.com/NEUBIAS/training-resources/raw/master/image_data/xyz__multiple_images.czi
aicsimage_object = AICSImage('xyz__multiple_images.czi')
print(aicsimage_object)
print(type(aicsimage_object))
<AICSImage [Reader: CziReader, Image-is-in-Memory: False]>
<class 'aicsimageio.aics_image.AICSImage'>
# Inspect dimensions of the object
print(aicsimage_object.dims)
<Dimensions [T: 1, C: 1, Z: 2, Y: 251, X: 251]>
# Access all 3 dimensions
img_3d = aicsimage_object.data

# Alternative
img_3d = aicsimage_object.get_image_data('ZYX')

#Inspect image type and shape
print(type(img_3d))
print(img_3d.shape)

# Access only the first z plane
img_2d = aicsimage_object.get_image_data('YX',Z=0)

#Inspect image type and shape
print(type(img_2d))
print(img_2d.shape)
<class 'numpy.ndarray'>
(2, 251, 251)
<class 'numpy.ndarray'>
(251, 251)
# Inspect scene metadata
print(aicsimage_object.scenes)

# Get current scene
print(f'Current scene: {aicsimage_object.current_scene}')
('xyz__multiple_images-0', 'xyz__multiple_images-1')
Current scene: xyz__multiple_images-0
# Inspect pixel size of image
print(aicsimage_object.physical_pixel_sizes)
print(f'A pixel has a length of {np.round(aicsimage_object.physical_pixel_sizes.X,2)} microns in X dimension.')

# Inspect channel metadata
print(aicsimage_object.channel_names)
PhysicalPixelSizes(Z=0.3, Y=0.19564437607395324, X=0.19564437607395324)
A pixel has a length of 0.2 microns in X dimension.
['ChA']
# Inspect image metadata
print(type(aicsimage_object.metadata))

print(aicsimage_object.metadata)
<class 'xml.etree.ElementTree.Element'>
<Element 'ImageDocument' at 0x177227880>

Save image

# Load first example image
image_url = "https://github.com/NEUBIAS/training-resources/raw/master/image_data/xy_8bit__nuclei_PLK1_control.tif"
aicsimage_object = AICSImage(image_url)
print(aicsimage_object.physical_pixel_sizes)
PhysicalPixelSizes(Z=None, Y=0.16605318318140297, X=0.16605318318140297)

Option 1

# Save aicsimage object directly as .ome.tif
aicsimage_object.save('option1.ome.tif')

# Re-load image to check on availability of pixel metadata
print(AICSImage('option1.ome.tif').physical_pixel_sizes)
PhysicalPixelSizes(Z=None, Y=0.16605318318140297, X=0.16605318318140297)

Option 2

# Save numpy.array as .ome.tif
from aicsimageio.writers import OmeTiffWriter

img_data = aicsimage_object.get_image_data('YX')

# Inspect image shape
print(img_data.shape)

OmeTiffWriter.save(img_data,
                   'option2.ome.tif',
                   dim_order='YX',
                   physical_pixel_sizes=aicsimage_object.physical_pixel_sizes)

# Re-load image to check on availability of pixel metadata
print(AICSImage('option2.ome.tif').physical_pixel_sizes)
(682, 682)
PhysicalPixelSizes(Z=None, Y=0.16605318318140297, X=0.16605318318140297)

Option 3

# Save numpy.array as .ome.zarr
from aicsimageio.writers import OmeZarrWriter

OmeZarrWriter('option3.ome.zarr').write_image(
    img_data,
    image_name='Option3',
    channel_names=None,
    channel_colors=None,
    dimension_order='YX',
    physical_pixel_sizes=aicsimage_object.physical_pixel_sizes
)

Option 4

# Save numpy.array directly

np.save('option4.npy',img_data)

# load .npy files

reloaded_img = np.load('option4.npy')

# Check if they are the same

print(f'Are the dimensions the same: {np.all(img_data.shape == reloaded_img.shape)}')
print(f'Are the images the same: {np.all(img_data == reloaded_img)}')
Are the dimensions the same: True
Are the images the same: True

tischi commented 1 month ago

@felixS27 does one really need to explicitly install numpy?

conda create -n ImageFileFormats python=3.10 numpy activate ImageFileFormat pip install aicsimageio=4.14.0

If so, should it now be numpy<2.0 ?

felixS27 commented 1 month ago

No, sorry, my mistake. Numpy is installed automatically as a dependency of aicsimageio, so there is no need to install it explicitly.

felixS27 commented 1 month ago

AICSImageIO was recently migrated to BioIO, and AICSImageIO has been put into maintenance mode. The interface is essentially the same (except that AICSImage was renamed to BioImage), so accessing all the data is still possible with the usual keywords (https://bioio-devs.github.io/bioio/MIGRATION.html). However, the package is now more modular, meaning that you have to install the right plugins along with the main module, depending on which image formats you want to read. According to the developers, this has some advantages in terms of dependencies. My question is whether I should update the notebook to use BioIO instead of AICSImageIO, or just mention BioIO at the end of the notebook.
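
A sketch of what the switch could look like on the import side, assuming only what is stated above (the class is called BioImage in BioIO and AICSImage in AICSImageIO). The names `ImageClass` and `backend` are introduced here for illustration. The format-specific BioIO plugins still need to be installed separately and are not covered by this snippet.

```python
# Prefer BioIO, fall back to AICSImageIO, and cope with neither being installed.
try:
    from bioio import BioImage as ImageClass
    backend = "bioio"
except ImportError:
    try:
        from aicsimageio import AICSImage as ImageClass
        backend = "aicsimageio"
    except ImportError:
        ImageClass = None
        backend = None

print(f"Image reader backend: {backend}")
```

The rest of a notebook could then call `ImageClass(path)` where it previously called `AICSImage(path)`.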

tischi commented 1 month ago

Thanks @felixS27 !

Since we do not have a course coming up very soon, I would suggest implementing the forward-looking BioIO API.

Also your markdown document should be reformatted into a python script, which should be added here via a PR.

See, e.g. here for how this is done for other teaching modules.

felixS27 commented 1 month ago

Sure. I will update the notebook and convert it into a python script and add it to the repo.

tischi commented 1 month ago
["image_file_formats/open_diverse_file_formats.md", [["ImageJ GUI", "image_file_formats/open_diverse_file_formats_imagejgui.md", "markdown"]]]
["image_file_formats/open_diverse_file_formats.md", [["ImageJ GUI", "image_file_formats/open_diverse_file_formats_imagejgui.md"],["python BioIO", "image_file_formats/open_diverse_file_formats_bioio.py"]]]