computationalpathologygroup / pythostitcher

Python tool for stitching histopathology tissue fragments into artificial whole-mounts.
GNU Lesser General Public License v2.1

removing first layers of .tif / creating hand segmented mask #3

Closed StefanBst closed 2 weeks ago

StefanBst commented 2 weeks ago

Hi!

First of all, thank you for making PythoStitcher public and out-of-the-box usable with the Docker container!

The examples provided work perfectly, but I wanted to try it on my own data, which, unfortunately, is much larger than the examples.

I tried removing the first few layers using PIL to eliminate the super high-resolution pages and then saved it again, making sure to copy the metadata. However, every time I save it, mir.MultiResolutionImageReader() returns NoneType when reading the file.

Do you know why this might be happening? Does mir.MultiResolutionImageReader() require additional information besides DPI, compression, resolution, or TIFF metadata to read the file properly?

Additionally, since I want to use my own labeled masks, I'm struggling to figure out how to create a multi-resolution mask .tif file that mir.MultiResolutionImageReader() can read.

Any suggestions on how to tackle this problem? Has anyone else faced a similar issue?

Thanks a lot!

Best regards, Stefan

dnschouten commented 2 weeks ago

Hi Stefan,

The mir.MultiResolutionImageReader() uses openslide as a backend and will fail to read a slide if it is not a pyramidal tif file. I've never used PIL for whole slide images, but I'd assume it doesn't save them as regular pyramidal tif files, which seems to be causing the problem here.

You should be able to fix all issues if you use Pyvips (which is included in the Docker container) to convert your images/masks into a workable format. For the images, you can read a specific page as a Pyvips Image and directly save it as a pyramidal file, which effectively removes the highest resolution layers. You can use the same approach to obtain a readable multi-resolution version of your masks. I make quite extensive use of Pyvips throughout the code, so you can search the repo for examples of how to load/save images with Pyvips or check out the documentation (https://libvips.github.io/pyvips/vimage.html).
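A minimal sketch of the page-based approach described above. It assumes that each successive TIFF page halves the resolution of the previous one, which is common for scanner pyramids but not guaranteed. The helper names `spacing_at_page` and `save_page_as_pyramid` are hypothetical and not part of PythoStitcher.

```python
def spacing_at_page(base_spacing_um, page):
    """Spacing (µm/px) of a zero-indexed page, assuming 2x downsampling per page."""
    if page < 0:
        raise ValueError("page must be >= 0")
    return base_spacing_um * (2 ** page)


def save_page_as_pyramid(input_path, output_path, page, compression="jpeg"):
    """Load one page of a multi-page TIFF and re-save it as a tiled pyramidal TIFF."""
    # Imported here so that spacing_at_page can be used without pyvips installed.
    import pyvips

    # pyvips numbers pages from zero, so page=3 is the fourth page in the file.
    image = pyvips.Image.new_from_file(str(input_path), page=page)
    image.write_to_file(
        str(output_path),
        pyramid=True,             # write a multi-resolution pyramid
        tile=True,                # tiled layout, as is usual for whole-slide images
        compression=compression,
    )
    return image.width, image.height
```

For label masks, a lossless setting such as `compression="lzw"` is probably the safer choice, since lossy JPEG compression can alter label values.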

By the way, if you do not need the highest resolution stitched version, you can also specify the output resolution in µm/px with the --resolution argument. It defaults to 0.25 µm/px, which is the highest resolution for the sample data, but you can also use any value that differs from it by factors of two (e.g. 0.125, 0.5, 8.0). PythoStitcher will then only read and save slides at that specific resolution, which can lead to great speed-ups.
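As an illustration of the "factors of two" rule, the sketch below checks whether a requested resolution is a power-of-two step away from a base resolution. The function `is_power_of_two_step` is a hypothetical helper; whether PythoStitcher validates the argument this way is not stated in this thread.

```python
import math


def is_power_of_two_step(resolution_um, base_um=0.25, tol=1e-9):
    """True if resolution_um equals base_um * 2**k for some integer k."""
    if resolution_um <= 0 or base_um <= 0:
        return False
    exponent = math.log2(resolution_um / base_um)
    return abs(exponent - round(exponent)) < tol
```

With the default base of 0.25 µm/px, the values 0.125, 0.5 and 8.0 pass this check, while a value such as 0.3 does not.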

Please let me know if the above solved your problems. If not, perhaps you can provide me with some more info regarding your data so I can dig a bit deeper into what's causing this.

Best, Daan

StefanBst commented 2 weeks ago

Hi Daan,

Thank you for the fast answer and detailed explanation. Everything you mentioned worked out perfectly. I loaded a lower-resolution page with Pyvips and saved it again as pyramidal. I did the same for my mask, and both look exactly the same as the artificial example you added to your repo.

The issue I'm facing now is that when I start running the script with my data, I get the following error:

Traceback (most recent call last):
  File "/home/user/pythostitcher-0.3.1/src/main.py", line 287, in <module>
    main()
  File "/home/user/pythostitcher-0.3.1/src/main.py", line 267, in main
    run_case(data_dir, save_dir, output_res)
  File "/home/user/pythostitcher-0.3.1/src/main.py", line 241, in run_case
    generate_full_res(parameters=parameters, log=log)
  File "/home/user/pythostitcher-0.3.1/src/pythostitcher_utils/full_resolution.py", line 518, in generate_full_res
    f.get_scaling()
  File "/home/user/pythostitcher-0.3.1/src/pythostitcher_utils/full_resolution.py", line 94, in get_scaling
    res_per_level = [
  File "/home/user/pythostitcher-0.3.1/src/pythostitcher_utils/full_resolution.py", line 95, in <listcomp>
    self.raw_image.getSpacing()[0] * scale
IndexError: tuple index out of range

I downloaded the ASAP tool you published under the Computational Pathology Group and loaded one of your examples. It shows a mm or µm scale bar in the bottom left, which indicates that it knows the spacing. However, when I load my own .tif file, it just says "pixels", so it seems there is no spacing encoded in my .tif file, and there is no spacing attribute in the metadata of pyvips.Image. (This also happens if I don't load a page and downscale the image instead, so the device is probably missing that information or saving it differently.)

Image Metadata:

Width: 58496
Height: 82304
Bands: 3
Format: uchar
width: 58496
height: 82304
bands: 3
format: uchar
coding: none
interpretation: srgb
xoffset: 0
yoffset: 0
xres: 4000.0
yres: 4000.0
filename: */data/artificial/raw_images/fragment1.tif
vips-loader: tiffload
n-pages: 11
resolution-unit: cm
orientation: 1
vips-sequential: 1

All the metadata above is also present in my image. My question is, do you have a solution for this? Or do you have any idea how to add the spacing parameter to my .tif file?

Thanks again for all your help!

Best regards, Stefan

dnschouten commented 2 weeks ago

Hi Stefan,

Ah, apologies, I should have mentioned that you need to explicitly set the resolution of your .tif file before saving. If no resolution metadata is found, PythoStitcher will indeed not be able to figure out how it needs to scale the images with the transformations.

There is some discrepancy in how openslide and pyvips handle resolution metadata, but the following code snippet should work to infuse your .tif files with the required metadata:

original_spacing = 0.25 # resolution in µm/px, replace with image resolution of your original files
xyres = 1000 / original_spacing # pyvips stores resolution in pixels per mm, so convert µm/px to px/mm
pyvips_image_to_save = pyvips_image.copy(xres=xyres, yres=xyres) # make a new image to insert this resolution metadata
pyvips_image_to_save.write_to_file(filepath) # save as you normally would

If you regenerate the .tif of both the images and the masks I believe this should solve your problem. Btw using ASAP to ensure that a size indicator is loaded is indeed a smart way to verify that the metadata is interpreted correctly!
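The conversion in the snippet follows from pyvips storing `xres` and `yres` in pixels per millimetre, while the spacing is given in µm per pixel. The sketch below spells out both directions; the names `spacing_to_xres` and `xres_to_spacing` are hypothetical helpers introduced only for illustration.

```python
UM_PER_MM = 1000.0  # micrometres in one millimetre


def spacing_to_xres(spacing_um):
    """Convert spacing in µm/px to pyvips resolution in px/mm."""
    if spacing_um <= 0:
        raise ValueError("spacing must be positive")
    return UM_PER_MM / spacing_um


def xres_to_spacing(xres):
    """Convert pyvips resolution in px/mm back to spacing in µm/px."""
    if xres <= 0:
        raise ValueError("xres must be positive")
    return UM_PER_MM / xres
```

This agrees with the example metadata earlier in the thread: the sample fragment reports xres 4000.0, which corresponds to a spacing of 0.25 µm/px.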

Please let me know should you still encounter any issues down the line.

Best, Daan

StefanBst commented 2 weeks ago

Hi Daan, sorry to bother you again. When I follow your instructions, I can see the change in xres/yres in the pyvips metadata:

Before:

Image Metadata:

Width: 22847
Height: 51381
Bands: 3
Format: uchar
width: 22847
height: 51381
bands: 3
format: uchar
coding: none
interpretation: srgb
xoffset: 0
yoffset: 0
xres: 8237.3203125
yres: 8237.3203125
filename: */raw_images/image_11.tif
vips-loader: tiffload
n-pages: 9

After:

Image Metadata:

Width: 22847
Height: 51381
Bands: 3
Format: uchar
width: 22847
height: 51381
bands: 3
format: uchar
coding: none
interpretation: srgb
xoffset: 0
yoffset: 0
xres: 2000.0
yres: 2000.0
filename: */raw_images/image_11_with_spacing.tif
vips-loader: tiffload
n-pages: 9

This is how I used it:

image = pyvips.Image.new_from_file(input_path, page=3)
original_spacing = 0.5 # first lvl is 0.125, second 0.25, third 0.5 µm/px
xyres = 1000 / original_spacing
image_to_save = image.copy(xres=xyres, yres=xyres)
image_to_save.write_to_file(output_path, pyramid=True, compression="jpeg")

But I get the same error, and I can't verify it with ASAP.

Kind regards, Stefan

dnschouten commented 2 weeks ago

Hmm that's strange. The metadata change in xres/yres seems reasonable and the fact that it can still call methods like getNumberOfLevels() (in line 93 right before the stack trace) seems to suggest that most of the metadata works fine.

Have you already tried saving the example files at different resolutions and loading them with the ASAP backend to verify that this works as expected? That is, save the example file as you normally would with pyvips and then load it with:

opener = mir.MultiResolutionImageReader()
image = opener.open(str(image_path))
spacing = image.getSpacing()

If the rest of PythoStitcher works with these modified example files the code snippet to modify the xres should not be the culprit. Otherwise, if this results in the same error, the code to save the image is probably the issue.
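The check above can be wrapped so that both failure modes seen in this thread produce a readable message: the reader returning None for an unreadable file, and getSpacing() returning an empty tuple. This is only a sketch. `validate_spacing` and `read_spacing` are hypothetical helpers, and the import name `multiresolutionimageinterface` is an assumption about how the ASAP Python bindings are installed in your environment.

```python
def validate_spacing(spacing):
    """Return (x, y) spacing in µm/px, or raise ValueError if it is missing."""
    if spacing is None or len(spacing) < 2:
        raise ValueError("no spacing metadata found in the file")
    x, y = float(spacing[0]), float(spacing[1])
    if x <= 0 or y <= 0:
        raise ValueError(f"spacing must be positive, got {(x, y)}")
    return x, y


def read_spacing(image_path):
    """Open a slide with the ASAP reader and return its validated spacing."""
    # Assumed import name for the ASAP Python bindings.
    import multiresolutionimageinterface as mir

    reader = mir.MultiResolutionImageReader()
    image = reader.open(str(image_path))
    if image is None:
        raise ValueError(f"reader could not open {image_path}")
    return validate_spacing(image.getSpacing())
```

Running `read_spacing` on a regenerated file before starting PythoStitcher would surface a missing spacing early, instead of as an IndexError deep inside get_scaling().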

And just to double-check, could you let me know if there are any blockers to using the --resolution argument when you run the container? If I understand correctly, that should accomplish the same thing as using a lower resolution version of your own files, or am I missing something here?

StefanBst commented 2 weeks ago

Hi Daan,

Thanks for all the support! I was able to solve the problem.

The issue turned out to be that the scanner used to generate the whole-slide .tif files had saved all the metadata in XML format. As a result, I could retrieve the number of levels with .getNumberOfLevels(), but I couldn't access the spacing information with .getSpacing(). Even after changing the resolution with PyVips, the XML metadata remained, which caused confusion for OpenSlide and mir.MultiResolutionImageReader().

My quick solution was to convert the image to a NumPy array and then back to PyVips, which effectively dropped the problematic XML data. I'll likely find a more elegant way to handle this in the future, but for now, everything is working perfectly!

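import pyvips

image_path = "*/raw_images/my_tif_with_xml.tif"
image = pyvips.Image.new_from_file(image_path, page=3) # load a lower-resolution page
image_array = image.numpy() # the round trip through NumPy keeps only the pixel data
back_to_pyvips = pyvips.Image.new_from_array(image_array)
original_spacing = 1.0 # spacing of the saved image in µm/px
xyres = 1000 / original_spacing # pyvips stores resolution in pixels per mm
image_to_save = back_to_pyvips.copy(xres=xyres, yres=xyres)
# path_to_save is the output .tif path
image_to_save.write_to_file(path_to_save, pyramid=True, compression="jpeg")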
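One possible alternative to the NumPy round trip is to remove the text metadata fields directly, which avoids materialising the full image as an array. This is an untested sketch under the assumption that the scanner XML is stored in the "image-description" or "xmp-data" field. Which field your scanner actually uses is not known from this thread, so inspect `image.get_fields()` first. The helper names are hypothetical.

```python
# Assumed candidates for where scanner XML may be stored; adjust after inspection.
TEXT_METADATA_FIELDS = ("image-description", "xmp-data")


def fields_to_remove(available_fields, candidates=TEXT_METADATA_FIELDS):
    """Return the candidate field names that are actually present."""
    return [name for name in candidates if name in available_fields]


def strip_text_metadata(image, candidates=TEXT_METADATA_FIELDS):
    """Return a copy of a pyvips image without the candidate metadata fields."""
    stripped = image.copy()  # pyvips images are immutable, so edit a copy
    for name in fields_to_remove(stripped.get_fields(), candidates):
        stripped.remove(name)
    return stripped
```

The returned image can then be given xres/yres with `copy(xres=..., yres=...)` and saved as a pyramidal file in the same way as above.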

Thanks again for the open-source tool and your ongoing support!

Best regards, Stefan

dnschouten commented 2 weeks ago

Glad to hear you solved it and everything's working as expected now. I'll go ahead and close this issue.

If you encounter any further/new problems in the future, feel free to open another issue so we can continue improving PythoStitcher :)