pilarOrtega closed this issue 3 years ago
In the first slide it seems that the grid cells just get too small compared to the actual resolution of the thumbnail and the line thickness. If thumbnail is taken at level 9 and patches extracted at level 0, for a slide of dimensions 90000x200000:
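As a rough check of that mismatch, here is a small sketch. It assumes that each pyramid level halves the resolution (downsample factor `2**level`) and uses a hypothetical patch size of 224 px at level 0. Neither value is read from the slide above, so the numbers are orders of magnitude only.

```python
def thumbnail_pixels_per_patch(slide_w, slide_h, thumb_level, patch_size):
    """Compare the thumbnail resolution with the number of patches to draw.

    Assumes a downsample factor of 2**thumb_level, which is an assumption
    about the pyramid and not something read from the slide.
    """
    downsample = 2 ** thumb_level
    thumb_w, thumb_h = slide_w // downsample, slide_h // downsample
    n_patches_w, n_patches_h = slide_w // patch_size, slide_h // patch_size
    # Number of thumbnail pixels available along one side of a patch
    px_per_patch = patch_size / downsample
    return (thumb_w, thumb_h), (n_patches_w, n_patches_h), px_per_patch


thumb_size, n_patches, px_per_patch = thumbnail_pixels_per_patch(
    90000, 200000, thumb_level=9, patch_size=224  # 224 is a hypothetical patch size
)
print(thumb_size, n_patches, px_per_patch)
```

Under these assumptions the thumbnail is about 175 x 390 px for roughly 401 x 892 patches, so each patch gets less than half a pixel and the grid lines cannot be told apart.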
I think a good workaround would be to use `slide.get_thumbnail` instead of `slide.read_region` to generate the thumbnail. It would let us specify a target resolution that is large enough for each patch to have several pixels assigned. I would do something like:
```python
from typing import Sequence, Tuple

import numpy
import openslide
from skimage.morphology import binary_dilation, disk

# Patch and NDByteImage are Pathaia type aliases, assumed to be already
# imported in the module where this function lives.


def preview_from_queries(
    slide: openslide.OpenSlide,
    queries: Sequence[Patch],
    min_res: int = 512,
    color: Tuple[int, int, int] = (255, 255, 0),
    thickness: int = 3,
    cell_size: int = 3,
) -> NDByteImage:
    """
    Give thumbnail with patches displayed.

    Args:
        slide: openslide object
        queries: patch queries {"x", "y", "dx", "dy", "level"}
        min_res: minimum size for the smallest side of the thumbnail (usually the width)
        color: rgb color for patch boundaries
        thickness: thickness of patch boundaries
        cell_size: size of a cell representing a patch in the grid

    Returns:
        Thumbnail image with patches displayed.
    """
    # get thumbnail first
    w, h = slide.dimensions
    dx = queries[0]["dx"]
    dy = queries[0]["dy"]
    # request enough pixels for every patch to get a cell plus its borders
    thumb_w = max(min_res, (w // dx) * (thickness + cell_size) + thickness)
    thumb_h = max(min_res, (h // dy) * (thickness + cell_size) + thickness)
    image = slide.get_thumbnail((thumb_w, thumb_h))
    # get_thumbnail may return a smaller image, so read back the actual size
    thumb_w, thumb_h = image.size
    dsr_w = w / thumb_w
    dsr_h = h / thumb_h
    image = numpy.array(image)[:, :, 0:3]
    # get grid
    grid = 255 * numpy.ones((thumb_h, thumb_w), numpy.uint8)
    for query in queries:
        # positions in queries are absolute
        x = int(query["x"] / dsr_w)
        y = int(query["y"] / dsr_h)
        dx = int(query["dx"] / dsr_w)
        dy = int(query["dy"] / dsr_h)
        startx = min(x, thumb_w - 1)
        starty = min(y, thumb_h - 1)
        endx = min(x + dx, thumb_w - 1)
        endy = min(y + dy, thumb_h - 1)
        # horizontal segments
        grid[starty, startx:endx] = 0
        grid[endy, startx:endx] = 0
        # vertical segments
        grid[starty:endy, startx] = 0
        grid[starty:endy, endx] = 0
    grid = grid < 255
    # thicken the one-pixel lines with a disk of radius `thickness`
    d = disk(thickness)
    grid = binary_dilation(grid, d)
    image[grid] = color
    return image
```
Note that:

- Normally `dsr_w = dsr_h`, as `get_thumbnail` preserves the aspect ratio, but I'd rather use a safe option on this.
- The code assumes that `dx` and `dy` are the same for every query (which is normally the case as `patch_size` is always the same). I think it would be better to have `patch_size` as an argument of this function though.
- It also assumes that patches do not overlap (the interval between patches equals `dx` and `dy`). If we want to take overlap into account, I suggest passing `interval` as an argument to this and adapting the formulas for `thumb_w` and `thumb_h`. But I think that generating grids that look fine for overlapping patches would be a pain in the a** anyway.
- Each patch is drawn as a square of side `2*thickness+cell_size` (with each square overlapping the next one). If patches are not square this will not work properly. To make it work we could just have `cell_size` multiplied by `max(dx, dy)/min(dx, dy)` in the formula for the largest side of the thumbnail.

However, none of the above comments are that important, as the core of the code stays the same. What changes is how we evaluate the resolution the thumbnail needs in order to actually display a grid. At worst it is a bit approximate and we end up in edge cases similar to the earlier ones, but they would become less common. Depending on how precise we want to be, this code can be more or less convoluted.
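To make the resolution estimate concrete, here is a minimal sketch of the size computation with the two suggestions from the notes folded in. The function name `target_thumbnail_size` and its `patch_size` and `interval` parameters are hypothetical, and stretching the cell along the longer patch side is only one reading of the non-square suggestion.

```python
from typing import Optional, Tuple


def target_thumbnail_size(
    slide_size: Tuple[int, int],
    patch_size: Tuple[int, int],
    interval: Optional[Tuple[int, int]] = None,
    min_res: int = 512,
    thickness: int = 3,
    cell_size: int = 3,
) -> Tuple[int, int]:
    """Estimate the thumbnail size to request so that each patch gets a cell."""
    w, h = slide_size
    dx, dy = patch_size
    # Without overlap, the stride between patches is the patch size itself.
    ix, iy = interval if interval is not None else (dx, dy)
    # Stretch the cell along the longer patch side (assumed reading of the note).
    ratio = max(dx, dy) / min(dx, dy)
    cell_w = cell_size * ratio if dx > dy else cell_size
    cell_h = cell_size * ratio if dy > dx else cell_size
    thumb_w = max(min_res, int((w // ix) * (thickness + cell_w) + thickness))
    thumb_h = max(min_res, int((h // iy) * (thickness + cell_h) + thickness))
    return thumb_w, thumb_h


# 224 px patches are a hypothetical choice for the 90000x200000 slide.
print(target_thumbnail_size((90000, 200000), (224, 224)))
print(target_thumbnail_size((90000, 200000), (448, 224)))
```

The result is only a request: `get_thumbnail` fits the image inside this box while preserving the aspect ratio, so the returned size can be smaller along one side, which is why the code above reads `image.size` back afterwards.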
Thanks Robin! Indeed, the thumbnail was not big enough for all patches to have one pixel, let alone create a grid with a given thickness on top.
Even with a bigger thumbnail size, the grid is not visible for lower levels. That can be solved by increasing the default `cell_size` parameter so that the cells do not disappear when dilating the grid (if we make it 10 or 20 times the thickness, the grid is visible at all levels, though the thumbnail is heavier). It would not change anything for thumbnails smaller than 512 px, but I think it is nicer for smaller levels.
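A one-dimensional cross-section shows why the cells vanish. This is a simplified sketch, not the actual 2-D dilation: it assumes grid lines are one pixel wide before dilation, spaced `thickness + cell_size` pixels apart as in the size formula, and that a disk of radius `thickness` widens each line by `thickness` pixels on both sides.

```python
def dilate_1d(mask, radius):
    """Binary dilation of a 1-D mask by a window of half-width `radius`."""
    n = len(mask)
    return [any(mask[max(0, i - radius): i + radius + 1]) for i in range(n)]


def free_pixels_between_lines(thickness, cell_size):
    """Count the undrawn pixels left between two neighbouring grid lines."""
    pitch = thickness + cell_size  # spacing between two grid lines
    mask = [i % pitch == 0 for i in range(pitch + 1)]  # lines at 0 and pitch
    dilated = dilate_1d(mask, thickness)  # same radius as disk(thickness)
    return dilated.count(False)


print(free_pixels_between_lines(thickness=3, cell_size=3))   # defaults
print(free_pixels_between_lines(thickness=3, cell_size=30))  # 10x the thickness
```

With the default values no pixel is left between two lines, so the dilated grid covers the whole thumbnail, whereas a `cell_size` of ten times the thickness leaves most of each cell visible.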
`dx` and `dy` should always be the same, but I still agree with you that we had better include this as an argument, in case it is ever needed.
And for the moment, I don't think it is worth including a grid for overlapping patches: depending on the overlap, it risks being a messy bunch of lines, which I don't think is interesting either. I believe it might be more interesting to show a grid which only displays the average borders between patches.
Still, the grid is just there to give a little overview, so it's not really essential that it's perfect in all edge cases. It's just nice to have a little map to know where the patches are :)
When we patchify a slide using Pathaia (with verbose >= 2), a thumbnail is produced at each level of extraction, with a grid showing the extracted patches (the image shows what it should look like).
When extracting patches from slides of a different cohort, at levels 2, 1 and 0, the thumbnails obtained look like this:
Levels 1 and 2 look the same as level 0 from the previous slide, while at level 0 the grid is not displayed at all. It may be due to the dilation of the grid used to set the line width.