AFM-SPM / TopoStats

An AFM image analysis program to batch process data and obtain statistics from images
https://afm-spm.github.io/TopoStats/
GNU Lesser General Public License v3.0
57 stars 10 forks source link

Look at replacing feret and centroid calculations in GrainStats #798

Closed ns-rse closed 5 months ago

ns-rse commented 7 months ago

Whilst working on #748 I needed to extract the Feret calculations so that I could obtain the co-ordinates for the minimum and maximum feret distances. Work in-progress on ns-rse/748-grain-profiles.

Initial implementation in #755 includes a bunch of tests as I had some issues getting what I thought were the correct answers out.

As this will be its own, tested, sub-module we can replace the functionality of GrainStats.get_min_max_feret() which currently only has a single 2x2 square tested and a regression test in place on the final values of cropped minicircle.spm. This can also reduce some duplication of code as convex hull is calculated twice, once for all points and then again for the upper and lower hulls.

I noticed also that there are calculations to determine the centroid of grains/molecules and we could look at replacing these with the Scikit Image's skimage.measure.regionprops() which returns the centroid (amongst many other things).

I started making the change to use topostats.measure.feret.min_max_feret() (currently stashed) and found that in the regression test whilst the values calculated for the max_feret were identical to those returned by GrainStats.get_min_max_feret() the min_feret differed.

Using some of the shapes in the tests developed for topostats.measure.feret to test the Gratinstats.get_max_min_ferets() revealed there may be some issues with the max_feret. These use incredibly simple tiny_{circle,square,quadrilateral} shapes as well as simple elipses and the tests fail for both min_feret and max_feret (see original test on a simple square at tests/test_grainstats.py::test_get_min_max_ferets()).

Code for these tests isn't included in #755 but follows (NB the axis, min_feret_coord_target and max_feret_coord_target parameters aren't used in the tests as GrainStats.get_max_min_ferets() only has one way of sorting convex hull edge points, but it was easier to just copy and paste them from the tests for topostats.measure.feret.min_max_feret()).

import numpy as np
import numpy.typing as npt
import pytest
from skimage import draw

tiny_circle = np.zeros((3, 3), dtype=np.uint8)
rr, cc = draw.circle_perimeter(1, 1, 1)
tiny_circle[rr, cc] = 1

small_circle = np.zeros((5, 5), dtype=np.uint8)
rr, cc = draw.circle_perimeter(2, 2, 2)
small_circle[rr, cc] = 1

tiny_quadrilateral = np.asarray(
    [
        [0, 0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [0, 1, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
    ],
    dtype=np.uint8,
)

tiny_square = np.asarray([[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]], dtype=np.uint8)

tiny_triangle = np.asarray([[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]], dtype=np.uint8)

tiny_rectangle = np.asarray([[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]], dtype=np.uint8)

tiny_ellipse = np.asarray(
    [
        [0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0],
    ],
    dtype=np.uint8,
)

holo_circle = np.zeros((7, 7), dtype=np.uint8)
rr, cc = draw.circle_perimeter(3, 3, 2)
holo_circle[rr, cc] = 1

holo_ellipse_vertical = np.zeros((11, 9), dtype=np.uint8)
rr, cc = draw.ellipse_perimeter(5, 4, 4, 3)
holo_ellipse_vertical[rr, cc] = 1

holo_ellipse_horizontal = np.zeros((9, 11), dtype=np.uint8)
rr, cc = draw.ellipse_perimeter(4, 5, 3, 4)
holo_ellipse_horizontal[rr, cc] = 1

holo_ellipse_angled = np.zeros((8, 10), dtype=np.uint8)
rr, cc = draw.ellipse_perimeter(4, 5, 1, 3, orientation=np.deg2rad(30))
holo_ellipse_angled[rr, cc] = 1

curved_line = np.zeros((10, 10), dtype=np.uint8)
rr, cc = draw.bezier_curve(1, 5, 5, -2, 8, 8, 2)
curved_line[rr, cc] = 1

@pytest.mark.parametrize(
    (
        "shape",
        "axis",
        "min_feret_distance_target",
        "min_feret_coord_target",
        "max_feret_distance_target",
        "max_feret_coord_target",
    ),
    [
        pytest.param(
            tiny_circle, 0, 1.4142135623730951, ([0, 1], [1, 0]), 2, ([1, 2], [1, 0]), id="tiny circle sorted on axis 0"
        ),
        pytest.param(
            tiny_square,
            0,
            1.0,
            ([1, 1], [2, 1]),
            1.4142135623730951,
            ([1, 2], [2, 1]),
            id="tiny square sorted on axis 0",
        ),
        pytest.param(
            tiny_quadrilateral,
            0,
            3.0,
            ([2, 4], [2, 1]),
            4.0,
            ([1, 2], [5, 2]),
            id="tiny quadrilateral sorted on axis 0",
        ),
        pytest.param(
            tiny_triangle,
            0,
            1.0,
            ([1, 1], [2, 1]),
            1.4142135623730951,
            ([1, 2], [2, 1]),
            id="tiny triangle sorted on axis 0",
        ),
        pytest.param(
            tiny_rectangle,
            0,
            1.0,
            ([1, 2], [1, 1]),
            2.23606797749979,
            ([1, 2], [3, 1]),
            id="tiny rectangle sorted on axis 0",
        ),
        pytest.param(tiny_ellipse, 0, 2.0, ([2, 3], [2, 1]), 4.0, ([1, 2], [5, 2]), id="tiny ellipse sorted on axis 0"),
        pytest.param(
            small_circle,
            1,
            4.0,
            ([0, 1], [4, 1]),
            4.47213595499958,
            ([1, 4], [3, 0]),
            id="small circle sorted on axis 0",
        ),
        pytest.param(
            holo_circle, 0, 4.0, ([1, 2], [5, 2]), 4.47213595499958, ([4, 5], [2, 1]), id="holo circle sorted on axis 0"
        ),
        pytest.param(
            holo_ellipse_horizontal,
            0,
            6.0,
            ([1, 3], [7, 3]),
            8.246211251235321,
            ([5, 9], [3, 1]),
            id="holo ellipse horizontal on axis 0",
        ),
        pytest.param(
            holo_ellipse_vertical,
            0,
            6.0,
            ([3, 7], [3, 1]),
            8.246211251235321,
            ([1, 5], [9, 3]),
            id="holo ellipse vertical on axis 0",
        ),
        pytest.param(
            holo_ellipse_angled,
            0,
            3.1622776601683795,
            ([1, 4], [2, 1]),
            7.615773105863909,
            ([5, 8], [2, 1]),
            id="holo ellipse angled on axis 0",
        ),
        pytest.param(
            curved_line,
            1,
            5.656854249492381,
            ([1, 5], [5, 1]),
            8.06225774829855,
            ([4, 1], [8, 8]),
            id="curved line sorted on axis 1",
        ),
    ],
)
def test_min_max_feret_grainstats(
    shape: npt.NDArray,
    axis: int,
    min_feret_distance_target: float,
    min_feret_coord_target: list,
    max_feret_distance_target: float,
    max_feret_coord_target: list,
) -> None:
    """Test calculation of min/max feret."""
    edge_points = np.argwhere(shape == 1)
    min_feret_distance, max_feret_distance = GrainStats.get_max_min_ferets(edge_points)
    # NB Comment out the next line to view failures/differences in max_feret
    assert min_feret_distance == min_feret_distance_target
    assert max_feret_distance == max_feret_distance_target
ns-rse commented 7 months ago

Done a bit more work investigating this by adding additional test images to tests/test_grainstats.py::test_get_min_max_ferets()).

If we use a tiny_circle and tiny_triangle we have some very basic shapes we can work with and easily think about what the results should be.

tiny_circle = np.zeros((3, 3), dtype=np.uint8)
rr, cc = draw.circle_perimeter(1, 1, 1)
tiny_circle[rr, cc] = 1
tiny_circle

array([[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]], dtype=uint8)

np.argwhere(tiny_circle == 1)

array([[0, 1],
       [1, 0],
       [1, 2],
       [2, 1]])

tiny_triangle = np.asarray([[1, 1], [1, 0]], dtype=np.uint8)
tiny_triangle

np.argwhere(tiny_triangle == 1)

array([[1, 1],
       [1, 0]], dtype=uint8)

array([[0, 0],
       [0, 1],
       [1, 0]])

Just as with the square we can use basic trigonometry to work out what the minimum and maximum ferets of these should be.

Triangle

Minimum Feret : $\sqrt(1^2 + 1^2)) / 2$ 0.7071067811865476 Maximum Feret : $\sqrt(1^2 + 1^2)$ 1.4142135623730951

Circle

Minimum Feret : $\sqrt(1^2 + 1^2)$ 1.4142135623730951 Maximum Feret : 2

Testing

@pytest.mark.parametrize(
    ("edge_points", "min_expected", "max_expected"),
    [
        pytest.param([[0, 0], [0, 1], [1, 0], [1, 1]], 1.0, 1.4142135623730951, id="square"),
        pytest.param([[0, 0], [0, 1], [1, 0]], 0.7071067811865476, 1.4142135623730951, id="triangle"),
        pytest.param([[0, 1], [1, 0], [1, 2], [2, 1]], 1.4142135623730951, 2.0, id="circle"),
    ],
)
def test_get_min_max_ferets(edge_points, min_expected, max_expected) -> None:
    """Tests the GrainStats.get_min_max_ferets method."""
    min_feret, max_feret = GrainStats.get_max_min_ferets(edge_points)
    np.testing.assert_almost_equal(min_feret, min_expected)
    np.testing.assert_almost_equal(max_feret, max_expected)
cwd: /home/neil/work/git/hub/AFM-SPM/TopoStats/
cmd: pytest --color=yes 'tests/test_grainstats.py::test_get_min_max_ferets'

======================================================== test session starts ========================================================
platform linux -- Python 3.11.6, pytest-7.4.2, pluggy-1.3.0
Matplotlib: 3.8.0
Freetype: 2.6.1
rootdir: /home/neil/work/git/hub/AFM-SPM/TopoStats
configfile: pyproject.toml
plugins: mock-3.11.1, mpl-0.16.1, github-actions-annotate-failures-0.2.0, xdist-3.3.1, napari-plugin-engine-0.2.0, regtest-1.5.1, cov-4.1.0, hypothesis-6.87.1, lazy-fixture-0.6.3, anyio-4.0.0, typeguard-4.1.5
collected 3 items                                                                                                                   

tests/test_grainstats.py ...                                                                                                  [100%]

---------- coverage: platform linux, python 3.11.6-final-0 -----------
Name                                Stmts   Miss  Cover
-------------------------------------------------------
topostats/__main__.py                   2      2     0%
topostats/entry_point.py              105    105     0%
topostats/filters.py                  154    133    14%
topostats/grains.py                   105     81    23%
topostats/grainstats.py               331    235    29%
topostats/io.py                       412    348    16%
topostats/logs/logs.py                 25      0   100%
topostats/plotting.py                 156    127    19%
topostats/plottingfuncs.py            139    139     0%
topostats/processing.py               213    213     0%
topostats/run_topostats.py            109    109     0%
topostats/scars.py                    105     94    10%
topostats/statistics.py                26     26     0%
topostats/theme.py                     40      5    88%
topostats/thresholds.py                31     18    42%
topostats/tracing/dnacurvature.py      24     24     0%
topostats/tracing/dnatracing.py       418    365    13%
topostats/tracing/skeletonize.py       23     12    48%
topostats/tracing/tracingfuncs.py     613    550    10%
topostats/utils.py                     88     68    23%
topostats/validation.py                15     15     0%
-------------------------------------------------------
TOTAL                                3134   2669    15%

========================================================= 3 passed in 0.25s =========================================================

Ok this is good, the basic shapes work, however in the case of the tiny_triangle there is no point on the convex hull that is mid-way along the hypotenuse so I'm surprised that the rotating caliper algorithm gets this correct. Since its based on pairs on the upper and lower convex hull the Rotating Caliper based minimum feret should be 1.

I've looked at how min_feret is calculated in GrainStats.get_min_max_feret() and it is calculated using the current pair of points on the rotating caliper and either the next point in the lower_hull or the previous point in the upper_hull.

small_feret = GrainStats.get_triangle_height(
                    np.array(lower_hull[lower_index + 1, :]),
                    np.array(lower_hull[lower_index, :]),
                    np.array(upper_hull[upper_index, :]),
                )
if min_feret is None or small_feret < min_feret:
                    min_feret = small_feret

...and...

small_feret = GrainStats.get_triangle_height(
                    np.array(upper_hull[upper_index - 1, :]),
                    np.array(upper_hull[upper_index, :]),
                    np.array(lower_hull[lower_index, :]),
                )
if min_feret is None or small_feret < min_feret:
                    min_feret = small_feret

A useful post Minimum Feret Diameter » Steve on Image Processing with MATLAB - MATLAB & Simulink explains why this is the correct way to calculate the minimum feret (the visual diagrams were really helpful).

Thus #755 needs revising before it can be merged to use this method.

Apologies for doubting your solution @SylviaWhittle .

MaxGamill-Sheffield commented 7 months ago

Note: Necessary refactor due to bounding boxes not being returned and needed for traces