BodenmillerGroup / steinbock

A toolkit for processing multiplexed tissue images
https://bodenmillergroup.github.io/steinbock
MIT License

External OME.TIFF file segmentation, transformed from qptiff #244

Closed alexeykb closed 3 months ago

alexeykb commented 5 months ago

Hello, I am not sure whether I can modify my older question, so I had to open a new issue.

I am having trouble segmenting an ome.tiff with steinbock. My original CODEX files are in qptiff format. I converted the qptiff to ome.tiff using Bio-Formats: https://docs.openmicroscopy.org/bio-formats/6.10.1/users/comlinetools/index.html This increased the file size from 25 GB to 75 GB. I stored the result in path/to/publication/steinbock/external and then tried to feed it into the IMC pipeline (starting with steinbock) using this command:

steinbock preprocess external images --img /data/external --imgout /data/images --infoout /data/images/images.csv

Unfortunately, it didn't work and returned this warning:

WARNING steinbock.preprocessing.external - Unsupported file format: /data/external/scanone.ome.tiff

Originally, I extracted the channel names from the ome.tiff file, and they have the form Channel:0:0, Channel:0:1, etc.

This is not accepted by steinbock, which raises the error:

ValueError: invalid literal for int() with base 10: 'Channel:0:0'

So I changed them to 0, 1, etc. By the way, why is this an issue? Why must the channel name be an int()? In the example panel.csv file the channel names are 'Y89', 'ln113', ... (i.e. not numbers).

What should my next steps be? Should I open the ome.tiff through qptiff and rename the channels manually, or is the problem in the ome.tiff file itself rather than in the channel names? If the ome.tiff is at fault, how can I work around it? I also tried converting the qptiff through QuPath, and it did not give any positive results either.

Thank you. Best, Alexey

P.S. I tried the command suggested in another thread:

steinbock preprocess external images --channels/last --img /data/external --imgout /data/images --infoout /data/images/images.csv

and it didn't work either:

Usage: steinbock preprocess external images [OPTIONS]

Try 'steinbock preprocess external images --help' for help.

Milad4849 commented 5 months ago

Please try to keep your issue focused and specific to a single question. Steinbock first attempts to read an external image with tifffile and then falls back to imageio. Your image cannot be read by either, so something is likely wrong with the metadata or dimensions; you will have to figure out what that is. I suggest inspecting the metadata and comparing it against a TIFF whose metadata can be read by imageio, such as the ones that can be obtained from here.

alexeykb commented 5 months ago

Thank you for your reply! Of course, I am sorry for combining multiple questions in one message.

I extracted the metadata from the ome.tiff using imageio as you suggested, and the only metadata that appears is the following:

is_fluoview False
is_nih False
is_micromanager False
is_ome True
is_lsm False
is_reduced 0
is_sgi False
is_shaped None
is_stk False
is_tiled True
is_mdgel False
compression COMPRESSION.NONE
predictor 1
is_mediacy False
description <?xml version="1.0" encoding="UTF-8" standalon...
description1
is_imagej None
software OME Bio-Formats 7.0.1

Am I missing something?

Milad4849 commented 5 months ago

I am not quite sure. What is the output of the following for your ome.tiff?

import tifffile

with tifffile.TiffFile('path/to/image') as tif:
    print("Page:", tif.pages[0].index)
    for tag in tif.pages[0].tags.values():
        print(f"{tag.name}: {tag.value}")
    print()

# Read the full array without dropping singleton axes and report its shape
img = tifffile.imread('path/to/image', squeeze=False)
print(img.shape)

For an example steinbock generated tiff, the output is the following:

Page: 0
ImageWidth: 600
ImageLength: 600
BitsPerSample: 32
Compression: 1
PhotometricInterpretation: 1
ImageDescription: ImageJ=1.11a
images=40
channels=40
hyperstack=true
mode=grayscale
StripOffsets: (352,)
SamplesPerPixel: 1
RowsPerStrip: 600
StripByteCounts: (1440000,)
XResolution: (1, 1)
YResolution: (1, 1)
ResolutionUnit: 1
Software: tifffile.py
SampleFormat: 3

(1, 1, 40, 600, 600, 1)

alexeykb commented 5 months ago

Thank you! Yes, this script gives much more information:

Page: 0

ImageWidth: 30720
ImageLength: 54720
BitsPerSample: 8
Compression: 1
PhotometricInterpretation: 1

ImageDescription: <?xml version="1.0" encoding="UTF-8" standalone="no"?>

urn:uuid:f388ea1a-c191-4155-8399-502d672a7509urn:uuid:f388ea1a-c191-4155-8399-502d672a7509urn:uuid:f388ea1a-c191-4155-8399-502d672a7509

And it triggers the MemoryError:

MemoryError: Unable to allocate 54.8 GiB for an array with shape (35, 54720, 30720) and data type uint8

Could the problem be caused by the large size of the file?

Milad4849 commented 5 months ago

It seems like it, can you try this script as well?

import imageio

img = imageio.volread('path/to/image')

height, width, depth = img.shape[:3]

data_type = img.dtype

num_frames = img.shape[0] if len(img.shape) > 3 else 1

print("Dimensions (H x W x D):", height, "x", width, "x", depth)
print("Data type:", data_type)
print("Number of frames:", num_frames)
print("Image shape:", img.shape)

alexeykb commented 5 months ago

Thank you so much for your reply!

This script produces the same error: ----> 3 img = imageio.volread('/path/to/file.tif')

MemoryError: Unable to allocate 54.8 GiB for an array with shape (35, 54720, 30720) and data type uint8

Why does it want to allocate 54.8 GiB for an array of shape (35, 54720, 30720)?

Milad4849 commented 5 months ago

Each uint8 is 1 byte, so (35 * 54720 * 30720) bytes / 1024^3 = 54.8 GiB. If there is enough memory available to the Python process, you should be able to read it. With such a large image you might run into other issues downstream. You can try to reduce the image size and/or divide up the image and analyze the pieces individually.
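The arithmetic above, and the suggestion to split the image, can be sketched in plain Python. This is only an illustration: the tile edge length of 8192 pixels and the helper names `array_size_gib` and `tile_windows` are assumptions made for this example and are not part of steinbock.

```python
import math


def array_size_gib(shape, bytes_per_element=1):
    """Return the in-memory size of a dense array in GiB (1 GiB = 1024**3 bytes)."""
    return math.prod(shape) * bytes_per_element / 1024**3


def tile_windows(height, width, tile_size):
    """Yield (row_start, row_stop, col_start, col_stop) windows covering the image."""
    for row in range(0, height, tile_size):
        for col in range(0, width, tile_size):
            yield row, min(row + tile_size, height), col, min(col + tile_size, width)


# Shape reported in the MemoryError; uint8 means 1 byte per element
shape = (35, 54720, 30720)
full_gib = array_size_gib(shape)
print(f"Full image: {full_gib:.1f} GiB")

# Hypothetical tile edge length, chosen only for illustration
tile_size = 8192
windows = list(tile_windows(shape[1], shape[2], tile_size))
largest_tile_gib = array_size_gib((shape[0], tile_size, tile_size))
print(f"{len(windows)} tiles, each at most {largest_tile_gib:.2f} GiB")
```

Each window can then be used to crop one piece, for example as `img[:, r0:r1, c0:c1]` on a (channels, height, width) array, so that only one piece has to fit in memory at a time.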

alexeykb commented 5 months ago

Hello! Thank you for your reply and suggestions! I selected only one well from the dataset and exported it to ome.tiff using this script in QuPath:


`// Define the name of the annotation to find
String targetAnnotationName = "Test_well"

// Attempt to find the annotation
def targetAnnotation = getAnnotationObjects().find { it.getName().equalsIgnoreCase(targetAnnotationName) }

if (targetAnnotation) {
    // Define the output file path
    String outputPath = "path/to/file/test_well.ome.tiff"

    // Get the ImageServer for the current image
    def server = getCurrentImageData().getServer()

    // Attempt to adjust the RegionRequest creation based on the found annotation's ROI
    def request = RegionRequest.createInstance(server.getPath(), 1, targetAnnotation.getROI())

    // Attempt to export the region defined by the annotation
    try {
        writeImageRegion(server, request, outputPath)
        print("Export successful: " + outputPath)
    } catch (Exception e) {
        print("Failed to export image region: " + e.getMessage())
    }
} else {
    print("Annotation named '${targetAnnotationName}' not found.")
}`

It exported an ome.tiff file of 300 MB.

I have tried to run steinbock on this new file and unfortunately got the same error:

steinbock preprocess external images --img /data/external --imgout /data/images --infoout /data/images/images.csv

2024-02-24 23:17:07,935 WARNING steinbock.preprocessing.external - Unsupported file format: /data/external/test_well.ome.tiff

I ran the test code you suggested and got these outputs. For the first test script:

Page: 0
ImageWidth: 4422
ImageLength: 4617
BitsPerSample: 8
Compression: 5
PhotometricInterpretation: 1
ImageDescription: <?xml version="1.0" encoding="UTF-8" standalone="no"?><Channel Color="-65536" ID="Channel:0:1" Name=......

FirstZ="0" IFD="0" PlaneCount="1">urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08urn:uuid:338f8d9a-98e9-4e3a-ba58-817562d3ec08 SamplesPerPixel: 1 XResolution: (19669113, 1000) YResolution: (19669113, 1000) PlanarConfiguration: 1 ResolutionUnit: 3 Software: OME Bio-Formats 7.0.1 TileWidth: 512 TileLength: 512 TileOffsets: (1040, 43692, 176512, 85198, 238077, 345283, 432830, 539193, 508449, 806714, 722103, 579904, 848627, 1125998, 990829, 1267617, 1409320, 1462706, 1632203, 1698283, 1493281, 1837472, 1976660, 
2109312, 2235684, 2364516, 2404430, 2435651, 2540335, 2679470, 2814410, 2951627, 3088199, 3207786, 3331347, 3398880, 3429531, 3683699, 3549883, 3820956, 3954423, 4086450, 4211524, 4354242, 4438937, 4469446, 4582764, 4720896, 4860256, 4998094, 5123109, 5270482, 5417280, 5505484, 5824773, 5686191, 5546540, 5902981, 6040201, 6168106, 6319660, 6452405, 6503117, 6637995, 6533879, 6682050, 6819035, 6952755, 7185344, 7088400, 7342055, 7382618, 7541373, 7498272, 7412600, 7586476, 7680844, 7858162, 7807171, 7987062, 8037000, 7979625, 7981479, 7983341, 7985312, 8027958, 8029816, 8031665, 8033610, 8035446) TileByteCounts: (42652, 41506, 61565, 91314, 107206, 87547, 75619, 40711, 30744, 41913, 84611, 142199, 142202, 141619, 135169, 141703, 53386, 30575, 66080, 139189, 138922, 139188, 132652, 126372, 128832, 39914, 31221, 104684, 139135, 134940, 137217, 136572, 119587, 123561, 67533, 30651, 120352, 137257, 133816, 133467, 132027, 125074, 142718, 84695, 30509, 113318, 138132, 139360, 137838, 125015, 147373, 146798, 88204, 41056, 78208, 138582, 139651, 137220, 127905, 151554, 132745, 50712, 30762, 44055, 104116, 136985, 133720, 135645, 156711, 96944, 40563, 29982, 45103, 43101, 85672, 94368, 126327, 121463, 50991, 40896, 30416, 1854, 1862, 1971, 1750, 1858, 1849, 1945, 1836, 1554) SampleFormat: 1

(1, 1, 35, 4617, 4422, 1)

For the second test script:

Dimensions (H x W x D): 35 x 4617 x 4422
Data type: uint8
Number of frames: 1
Image shape: (35, 4617, 4422)

How should I proceed to find out what exactly the issue is, and why the qptiff is still unsupported after conversion to ome.tiff? Thank you so much
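The `ID="Channel:0:1"` attribute visible in the ImageDescription above is the OME-XML channel ID, which is distinct from the channel's `Name` attribute. A minimal sketch of listing both with the Python standard library follows. The XML string is a hypothetical, heavily shortened stand-in for a real ImageDescription, the channel names DAPI and CD3 are invented, and `list_channels` is an illustrative helper, not a steinbock function. With tifffile, the real XML string should be available as `tif.ome_metadata` on an open `TiffFile`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened OME-XML; a real ImageDescription holds far more
OME_XML = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:0">
    <Pixels ID="Pixels:0" DimensionOrder="XYCZT" Type="uint8"
            SizeX="4422" SizeY="4617" SizeC="2" SizeZ="1" SizeT="1">
      <Channel ID="Channel:0:0" Name="DAPI" SamplesPerPixel="1"/>
      <Channel ID="Channel:0:1" Name="CD3" SamplesPerPixel="1"/>
    </Pixels>
  </Image>
</OME>"""


def list_channels(ome_xml):
    """Return (index, channel_id, name) for every Channel element in the XML."""
    root = ET.fromstring(ome_xml.encode("utf-8"))
    channels = []
    for element in root.iter():
        # Strip the XML namespace, e.g. "{http://...}Channel" becomes "Channel"
        if element.tag.rsplit("}", 1)[-1] != "Channel":
            continue
        channel_id = element.get("ID", "")
        # The last colon-separated field of the ID is the channel index
        index = int(channel_id.rsplit(":", 1)[-1])
        channels.append((index, channel_id, element.get("Name")))
    return channels


channels = list_channels(OME_XML)
for index, channel_id, name in channels:
    print(index, channel_id, name)
```

Note that `int("Channel:0:0")` fails with the same ValueError quoted earlier in this thread, while the trailing field of the ID parses cleanly as an integer.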

Milad4849 commented 4 months ago

Do the scripts read your file without any issues? If you can read the file using tifffile.imread(), then steinbock should be able to read it too. Can you convert the file to a different format to see if that changes anything? For example, is it possible to save or convert it to a format without the metadata of an ome.tiff? You can also upload a file somewhere for me to have a look at. In general, it seems that the OME-TIFF file may contain metadata or features that are not supported by the packages.
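One way to try the conversion suggested here is to read the array with tifffile and write it back as an ImageJ-style TIFF, which drops the OME-XML description. The sketch below assumes a (channels, height, width) array and casts to float32 only because the steinbock-generated example earlier in this thread shows 32-bit float samples with ImageJ metadata. Whether steinbock then accepts the file is untested, and `rewrite_as_imagej_tiff` is a hypothetical helper name.

```python
import numpy as np
import tifffile


def rewrite_as_imagej_tiff(src_path, dst_path):
    """Read an image with tifffile and save it again as an ImageJ hyperstack TIFF."""
    img = tifffile.imread(src_path)
    if img.ndim != 3:
        raise ValueError(f"expected a (C, Y, X) array, got shape {img.shape}")
    # Assumption: float32 samples, mirroring the steinbock example output above
    tifffile.imwrite(
        dst_path,
        img.astype(np.float32),
        imagej=True,
        metadata={"axes": "CYX"},
    )
    return img.shape
```

A call would look like `rewrite_as_imagej_tiff("path/to/input.ome.tiff", "path/to/output.tiff")`, with both paths as placeholders. Casting uint8 to float32 quadruples the memory footprint, so this is only practical for the cropped file, not the full-size scan.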

Milad4849 commented 4 months ago

Is this resolved?

Milad4849 commented 3 months ago

Closing due to lack of response. When I spend time to resolve your problem, particularly when it is not related to steinbock, I expect a timely response. I do not have time to get into an issue multiple times.