computationalpathologygroup / ASAP

Program for the analysis and visualization of whole-slide images in digital pathology
https://computationalpathologygroup.github.io/ASAP/
GNU General Public License v2.0

return None in case of OpenSlide error #255

Closed peterbandi closed 1 year ago

peterbandi commented 1 year ago

This PR addresses 3 issues. In order of importance:

1 Invalid MRXS image reading

An invalid MRXS image pushes the multiresolutionimageinterface into a state where it returns only empty (white) patches. Unfortunately, valid MRXS files can also return completely white patches, so such failures cannot be detected from outside the library.
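To illustrate why whiteness alone cannot serve as an error signal, here is a minimal sketch. The helper is_all_white and the two byte buffers are hypothetical and exist only for this illustration; they are not part of ASAP or OpenSlide.

```python
def is_all_white(pixels, white=255):
    """Return True if every channel value in a flat pixel sequence equals `white`."""
    return all(value == white for value in pixels)


# Hypothetical data: a 4x4 RGB patch of genuine slide background, and a
# 4x4 RGB patch whose buffer was only cleared after a failed read.
valid_background = bytes([255]) * (4 * 4 * 3)
cleared_after_error = bytes([255]) * (4 * 4 * 3)

# Both patches pass the same check, so the caller cannot tell them apart.
print(is_all_white(valid_background), is_all_white(cleared_after_error))
```

Since both buffers are byte-for-byte identical, any pixel-based heuristic on the caller side would either miss real failures or reject valid background patches.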

As described here, once the openslide_get_error function returns a non-NULL value, the only useful operation on the object is to call openslide_close. Unfortunately, the OpenSlideImage class does not check this after calling openslide_read_region. If an error occurs or has already occurred, openslide_read_region only clears the output array, hence the completely white patches once the faulty tile has been encountered.

As a fix, a check of openslide_get_error is added to OpenSlideImage. In case of an error, OpenSlideImage returns NULL. Furthermore, the MultiResolutionImage::getRawRegion functions also return NULL in case of an error, which is forwarded to the Python interface as None, clearly indicating that the reader has entered an invalid state.

I can possibly provide a sample image to reproduce the error. Reading from a specific location at level 0 of this image makes OpenSlide fail:

import openslide
image = openslide.open_slide("image.mrxs")
patch = image.read_region(location=(148992, 250880), level=0, size=(512, 512))
# raises openslide.lowlevel.OpenSlideError: Corrupt JPEG data: premature end of data segment

whereas reading from this location with multiresolutionimageinterface results in an all-white patch, and from this point on only white patches are returned by further read calls:

import multiresolutionimageinterface as mri
image_reader = mri.MultiResolutionImageReader().open("image.mrxs")
image_patch = image_reader.getUCharPatch(148992, 250880, 512, 512, 0)
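With the fix in this PR, a caller could turn the None result into an explicit failure. The following is a minimal sketch, assuming getUCharPatch surfaces the NULL from getRawRegion as None as described above. The names SlideReadError and get_patch_or_raise are hypothetical helpers for this illustration and are not part of ASAP.

```python
class SlideReadError(RuntimeError):
    """Hypothetical exception type used only in this sketch."""


def get_patch_or_raise(image, x, y, width, height, level):
    """Read a patch and raise instead of silently passing on an invalid result.

    `image` is expected to expose getUCharPatch(x, y, width, height, level),
    as in the multiresolutionimageinterface example above.
    """
    patch = image.getUCharPatch(x, y, width, height, level)
    if patch is None:
        # Assumption: None signals that the underlying reader hit an error
        # and should not be used for further reads.
        raise SlideReadError(
            f"reading {width}x{height} patch at ({x}, {y}), level {level} failed"
        )
    return patch
```

A caller would then use get_patch_or_raise(image_reader, 148992, 250880, 512, 512, 0) and treat SlideReadError as a signal to discard the reader, rather than continuing to read white patches from it.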

2 Missing include and namespace

The include of <cmath> and the std:: namespace qualifier on some pow and sqrt calls were missing from ImageScopeRepository.cpp, making the no-GUI builds fail.

3 Use of pugixml from apt repository

I would like to make ASAP into a conda package. For that, it would be beneficial if it used the standard PugiXML build (which is also present in the conda repositories) instead of the custom-built one. The use of non-inline function calls probably degrades XML reading/writing performance somewhat, but probably not by much.

The use of the libpugixml-dev package has been added and the use of the local pugixml build has been removed. This last part is not necessary and can be omitted.