DigitalSlideArchive / HistomicsTK

A Python toolkit for pathology image analysis algorithms.
https://digitalslidearchive.github.io/HistomicsTK/
Apache License 2.0

numpy.linalg.LinAlgError: Singular matrix with color deconvolution #834

Closed: sumanthratna closed this issue 4 years ago

sumanthratna commented 4 years ago

This code is returning an error:

W_target = dask.array.from_array([
    [0.5807549,   0.08314027,  0.08213795],
    [0.71681094,  0.90081588,  0.41999816],
    [0.38588316,  0.42616716, -0.90380025]
])

stain_unmixing_routine_params = {
    'stains': ['hematoxylin', 'eosin'],
    'stain_unmixing_method': 'macenko_pca',
}

W_source = S_source.T[pairwise_distances(
    S_source.T, W_target.T).argmin(axis=1), :].T
W_target = S_target.T[pairwise_distances(
    S_target.T, W_target.T).argmin(axis=1), :].T

tissue_rgb_normalized = deconvolution_based_normalization(
    ihc_rgb,
    W_source=W_source,
    W_target=W_target,
    stain_unmixing_routine_params=stain_unmixing_routine_params,
    mask_out=masking(ihc_rgb) if use_mask else None
)
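
A note on the two reordering lines in the snippet above: `argmin(axis=1)` returns, for each estimated stain, the index of its nearest reference stain, and nothing prevents two estimated stains from mapping to the same index. Selecting columns with those indices can then duplicate a column, which would leave the result rank-deficient. The sketch below illustrates the effect with NumPy only. It is an illustration under assumptions, not a diagnosis: `pairwise_euclidean` is a stand-in for scikit-learn's `pairwise_distances`, and `S_est` is made-up data rather than the output of any real image.

```python
import numpy as np


def pairwise_euclidean(a, b):
    """Euclidean distance between every row of a and every row of b
    (stand-in for sklearn.metrics.pairwise_distances)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)


# Reference stain matrix from the post (one stain vector per column).
W_ref = np.array([
    [0.5807549, 0.08314027, 0.08213795],
    [0.71681094, 0.90081588, 0.41999816],
    [0.38588316, 0.42616716, -0.90380025],
])

# Hypothetical estimated stains: columns 0 and 1 both lie closest to
# reference column 0, so they will receive the same index.
S_est = np.array([
    [0.40, 0.45, -0.17],
    [0.80, 0.78, 0.55],
    [0.45, 0.43, -0.82],
])

# Same indexing pattern as in the post.
idx = pairwise_euclidean(S_est.T, W_ref.T).argmin(axis=1)
W_selected = S_est.T[idx, :].T

print("nearest reference per estimated stain:", idx)
print("rank of S_est:", np.linalg.matrix_rank(S_est))
print("rank after selection:", np.linalg.matrix_rank(W_selected))
```

If the intent is to pick, for each reference stain, the closest estimated stain, `argmin(axis=0)` expresses that more directly, although it can still select the same estimated stain twice, so a rank check afterwards remains worthwhile.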

Here's the relevant part of the stack trace:

  File ".../macenko.py", line 178, in normalization
    mask_out=masking(ihc_rgb) if use_mask else None
  File ".../python3.6/site-packages/histomicstk/preprocessing/color_normalization/deconvolution_based_normalization.py", line 105, in deconvolution_based_normalization
    **stain_unmixing_routine_params)
  File ".../python3.6/site-packages/histomicstk/preprocessing/color_deconvolution/color_deconvolution.py", line 260, in color_deconvolution_routine
    Stains, StainsFloat, wc = color_deconvolution(im_rgb, w=W_source, I_0=None)
  File ".../python3.6/site-packages/histomicstk/preprocessing/color_deconvolution/color_deconvolution.py", line 77, in color_deconvolution
    Q = np.linalg.inv(wc)
  File "<__array_function__ internals>", line 6, in inv
  File ".../python3.6/site-packages/numpy/linalg/linalg.py", line 547, in inv
    ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
  File ".../python3.6/site-packages/numpy/linalg/linalg.py", line 97, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix

Here's W_source:

[[ 0.58290042  0.21542109 -0.0927159 ]
 [ 0.72414757  0.84884955  0.5099034 ]
 [ 0.36856124  0.48275065 -0.85522061]]

Here's W_target:

[[ 0.59929213  0.09243651 -0.00763791]
 [ 0.71776717  0.89055597  0.44791856]
 [ 0.35448446  0.44538248 -0.89404173]]

The shape of ihc_rgb is (3, 256, 256).

I would give more information, but I'm unable to consistently reproduce this error (the error occurs in the script I pasted above, but not in another script).

EDIT: To clarify, when I said the error doesn't occur in another script, I mean the above code is in a method, and this method is called in two locations. From one location everything works as expected, and from the other, I get this error.

manthey commented 4 years ago

Having a specific image where we can reproduce the error would greatly help. The specific parameters passed to deconvolution_based_normalization would do instead (probably just ihc_rgb).

sumanthratna commented 4 years ago

Unfortunately, I can't recreate the error, so I'll close this. I know I haven't updated HistomicsTK recently, and I don't think I found a solution. I'll reopen this issue if I come across it again.

sumanthratna commented 4 years ago

Yikes—I just realized I wrapped the code block in a try-except so I could continue development. I'm still unable to consistently reproduce this error, but this might help: TCGA-A7-A13E-01Z-00-DX1

TCGA-A7-A13E-01Z-00-DX1 from MoNuSeg

cooperlab commented 4 years ago

The stack trace indicates that the W_source matrix is singular. I checked the matrix you provided though and it's not singular. That is why you cannot reproduce the error.

You need to have at least two linearly independent columns in the stain matrix when calling color deconvolution.
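
For anyone following along, the check described here can be done with a few NumPy calls. This is a minimal sketch using the W_source values from the first post; the variable names are only illustrative.

```python
import numpy as np

# W_source exactly as printed in the first post (one stain per column).
W_source = np.array([
    [0.58290042, 0.21542109, -0.0927159],
    [0.72414757, 0.84884955, 0.5099034],
    [0.36856124, 0.48275065, -0.85522061],
])

det = np.linalg.det(W_source)            # zero for a singular matrix
cond = np.linalg.cond(W_source)          # very large when nearly singular
rank = np.linalg.matrix_rank(W_source)   # 3 means all columns independent

# The requirement stated above concerns the first two stain columns.
rank_first_two = np.linalg.matrix_rank(W_source[:, :2])

print(f"det={det:.4f} cond={cond:.2f} rank={rank} "
      f"rank of first two columns={rank_first_two}")
```

A rank or condition-number check tends to be more informative than calling `np.linalg.inv` and waiting for an exception, because `inv` raises only when the factorization hits an exactly zero pivot; a matrix that is singular up to rounding may instead return very large values without any error.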

sumanthratna commented 4 years ago

Alright, I've narrowed our dataset down to three images that produce this error. Here's one of them: TCGA-NH-A8F7-01A-01-TS1

If you'd like the other images, I can send them too.

I'm still trying to reproduce the error, and I'm struggling. The following produces an all-black output:

import numpy as np
from PIL import Image
from histomicstk.preprocessing.color_normalization.\
    deconvolution_based_normalization import deconvolution_based_normalization

ihc_rgb = np.array(Image.open(
    '/tmp/TCGA-NH-A8F7-01A-01-TS1.png').convert('RGB'))
W_source = np.array([
    [0.35950424,  0.35950424, -0.16814837],
    [0.80753471, 0.80753471,  0.54895428],
    [0.46759426,  0.46759426, -0.81876451]
])
W_target = np.array([
    [0.59929213,  0.09243651, -0.00763791],
    [0.71776717, 0.89055597,  0.44791856],
    [0.35448446,  0.44538248, -0.89404173]
])
stain_unmixing_routine_params = {
    'stains': ['hematoxylin', 'eosin'],
    'stain_unmixing_method': 'macenko_pca'
}
tissue_rgb_normalized = deconvolution_based_normalization(
    ihc_rgb,
    W_source=W_source,
    W_target=W_target,
    stain_unmixing_routine_params=stain_unmixing_routine_params
)

cooperlab commented 4 years ago

Does this reproduce the matrix singularity error, or is there a different error (output being black)?

sumanthratna commented 4 years ago

This script I just sent doesn't reproduce the matrix singularity error—there is a new error. I didn't create a new issue since the all-black output only happens with images that produce the matrix singularity error, so it seems like there might be a common problem.

cooperlab commented 4 years ago

So these images that produce an all black output also trigger the matrix singularity error? How are you bypassing the error to generate output?

sumanthratna commented 4 years ago

I have a script that calls a method in macenko.py for normalization. When running this script, I get the matrix singularity error for three images.

In an attempt to reproduce this issue, I created a new file with the contents I sent earlier. When I run this new file, I don't get an error, but tissue_rgb_normalized ends up as an all-black image. EDIT: it's worth noting that when I get all-black images, I get the following warnings:

.../site-packages/histomicstk/preprocessing/color_conversion/sda_to_rgb.py:34: RuntimeWarning: overflow encountered in power
  im_rgb = I_0 ** (1 - im_sda / 255.)
.../site-packages/histomicstk/preprocessing/color_conversion/rgb_to_sda.py:48: RuntimeWarning: divide by zero encountered in log
  im_sda = -np.log(im_rgb/(1.*I_0)) * 255/np.log(I_0)
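
The two warnings can be reproduced in isolation from the formulas shown in the warning text. The sketch below copies those two expressions into hypothetical helper functions (`rgb_to_sda_sketch` and `sda_to_rgb_sketch` are not the library functions) and assumes a background intensity `I_0` of 255. It shows that a pixel value of 0 sends the log to infinity, and that a very large negative SDA value, such as one produced by a badly conditioned stain matrix, overflows the power.

```python
import numpy as np

I_0 = 255.0  # assumed background intensity for this sketch


def rgb_to_sda_sketch(im_rgb, I_0):
    # Expression taken from the rgb_to_sda warning text.
    return -np.log(im_rgb / (1.0 * I_0)) * 255 / np.log(I_0)


def sda_to_rgb_sketch(im_sda, I_0):
    # Expression taken from the sda_to_rgb warning text.
    return I_0 ** (1 - im_sda / 255.0)


pixels = np.array([0.0, 1.0, 128.0, 255.0])

# A zero pixel gives log(0); the warning is silenced here on purpose.
with np.errstate(divide="ignore"):
    sda_raw = rgb_to_sda_sketch(pixels, I_0)

# Clipping to a minimum of 1 keeps every value finite.
sda_clipped = rgb_to_sda_sketch(np.clip(pixels, 1.0, None), I_0)

# A huge negative SDA value overflows when converted back to RGB.
with np.errstate(over="ignore"):
    rgb_overflow = sda_to_rgb_sketch(np.array([-1e6]), I_0)

print(sda_raw, sda_clipped, rgb_overflow)
```

Clipping is only one possible workaround and changes the darkest pixels slightly; whether it is appropriate depends on the pipeline.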

cooperlab commented 4 years ago

I'm trying to work the original example with the images you provided. 'S_source' and 'S_target' are not defined.

Please provide in a single post a stand-alone working example and data that produces the error.

sumanthratna commented 4 years ago

I've been working on that, but I cannot for the life of me figure out why I can't reproduce this. I'll send a stand-alone example as soon as possible, but if it helps, for now, removing the W_source=W_source argument fixes the issue.

The W_sources for the 3 images are singular. I'll do some more research and see what we should do in the case that W_source is singular.
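
One possible stopgap, given that dropping the W_source argument avoids the error, is to pass W_source only when its first two columns are linearly independent. The helper below is hypothetical (`w_source_kwargs` is not part of HistomicsTK) and is a sketch of that idea; the degenerate matrix is the one from the reproduction script earlier in the thread.

```python
import numpy as np


def w_source_kwargs(w_source):
    """Hypothetical helper: return {'W_source': w} only when the first two
    stain columns are linearly independent, otherwise an empty dict."""
    if w_source is None:
        return {}
    w = np.asarray(w_source, dtype=float)
    if w.ndim != 2 or w.shape[0] != 3 or w.shape[1] < 2:
        return {}
    if np.linalg.matrix_rank(w[:, :2]) < 2:
        return {}
    return {"W_source": w}


# Matrix from the reproduction script: columns 0 and 1 are identical.
degenerate = np.array([
    [0.35950424, 0.35950424, -0.16814837],
    [0.80753471, 0.80753471, 0.54895428],
    [0.46759426, 0.46759426, -0.81876451],
])

# Matrix from the first post: all three columns are independent.
usable = np.array([
    [0.58290042, 0.21542109, -0.0927159],
    [0.72414757, 0.84884955, 0.5099034],
    [0.36856124, 0.48275065, -0.85522061],
])

print(w_source_kwargs(degenerate))      # empty dict, so W_source is omitted
print(sorted(w_source_kwargs(usable)))  # contains 'W_source'
```

The result could then be unpacked into the call, for example `deconvolution_based_normalization(ihc_rgb, W_target=W_target, **w_source_kwargs(W_source))`, so that the function falls back to whatever it does when no source matrix is supplied.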

cooperlab commented 4 years ago

W_source can be singular / non-invertible, but must have at least two columns that are linearly independent. In that case the third column is derived to be orthogonal to the first two so that the matrix is invertible.

Each column here represents a stain, and so that is why you need at least two (you wouldn't deconvolve an image of a tissue stained with a single stain).
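
The derivation of a third column described above can be sketched with a cross product. This is an illustration of the idea under assumptions, not necessarily how HistomicsTK implements it; `complement_third_column` and its `tol` parameter are hypothetical names.

```python
import numpy as np


def complement_third_column(w, tol=1e-8):
    """Replace the third column of a 3x3 stain matrix with a unit vector
    orthogonal to the first two columns (hypothetical helper)."""
    w = np.asarray(w, dtype=float)
    s0, s1 = w[:, 0], w[:, 1]
    s2 = np.cross(s0, s1)
    norm = np.linalg.norm(s2)
    if norm < tol:
        # Parallel or duplicated stain vectors leave nothing to complement.
        raise ValueError("first two stain vectors are linearly dependent")
    return np.column_stack([s0, s1, s2 / norm])


# Two independent stain vectors and an empty third column.
w_two_stains = np.array([
    [0.58290042, 0.21542109, 0.0],
    [0.72414757, 0.84884955, 0.0],
    [0.36856124, 0.48275065, 0.0],
])

w_full = complement_third_column(w_two_stains)
print("rank before:", np.linalg.matrix_rank(w_two_stains))
print("rank after:", np.linalg.matrix_rank(w_full))

# Duplicated first two columns cannot be complemented.
w_duplicated = w_two_stains.copy()
w_duplicated[:, 1] = w_duplicated[:, 0]
try:
    complement_third_column(w_duplicated)
    failed = False
except ValueError:
    failed = True
print("duplicated columns rejected:", failed)
```

This also shows why two independent columns are the minimum: with only one distinct stain vector the cross product is zero, and no third column can make the matrix invertible.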