AFM-SPM / TopoStats

An AFM image analysis program to batch process data and obtain statistics from images
https://afm-spm.github.io/TopoStats/
GNU Lesser General Public License v3.0
56 stars 10 forks source link

Dnatracing fails when one molecule spline fails #696

Open SylviaWhittle opened 11 months ago

SylviaWhittle commented 11 months ago

The issue: It appears that there may be a little bit more work to be done on the removing loops from dnatracing.py. There is an error where in get_splined_traces(), where if a ValueError is raised on line 370, then self.mol_is_circular.pop(dna_num) is called. This is code from when the dnaTrace class acted on the whole image (all molecules) and not a single molecule. Hence, dna_num is undefined, self.mol_is_circular is a bool not a list and .pop() is therefore not defined for self.mol_is_circular.

image

There are lines below it that will be affected similarly. I tried to have a go at fixing this but it seems like it will need a bit of an in-depth consideration as I'm not familiar with the control flow surrounding this, and how to represent a failure state properly without breaking the data structures.

Then again, I believe this fix might not be necessary if I tidy up and address the comments on #653 Improve Linear Splining, as that removes this block of code entirely. Thoughts would be appreciated @ns-rse 😄

I'd really like to get this fixed ASAP as this is preventing me from fixing the test test_run_dnatracing() in test_processing.py which I need working to be able to add tests for #695 🙏

Version: Current main branch: 2.1.3.dev2+g6354f5477.d20231012 Config file: Default Processing file: minicircle.spm Command: pytest tests/test_processing.py::test_run_dnatracing (With run_dnatracing() in processing.py set to raise errors, not ignore them) Python Version: 3.10 OS: MacOS

Output:

self = <topostats.tracing.dnatracing.dnaTrace object at 0x163b3cd00>

    def get_splined_traces(self):
        """Gets a splined version of the fitted trace - useful for finding the radius of gyration etc

        This function actually calculates the average of several splines which is important for getting a good fit on
        the lower res data"""

        step_size = int(7e-9 / (self.pixel_to_nm_scaling))  # 3 nm step size
        interp_step = int(1e-10 / self.pixel_to_nm_scaling)
        # Lets see if we just got with the pixel_to_nm_scaling
        # step_size = self.pixel_to_nm_scaling
        # interp_step = self.pixel_to_nm_scaling

        self.splining_success = True
        nbr = len(self.fitted_trace[:, 0])

        # Hard to believe but some traces have less than 4 coordinates in total
        if len(self.fitted_trace[:, 1]) < 4:
            self.splined_trace = self.fitted_trace
            # continue

        # The degree of spline fit used is 3 so there cannot be less than 3 points in the splined trace
        while nbr / step_size < 4:
            if step_size <= 1:
                step_size = 1
                break
            step_size = -1
        if self.mol_is_circular:
            # if nbr/step_size > 4: #the degree of spline fit is 3 so there cannot be less than 3 points in splined trace

            # ev_array = np.linspace(0, 1, nbr * step_size)
            ev_array = np.linspace(0, 1, int(nbr * step_size))

            for i in range(step_size):
                x_sampled = np.array(
                    [self.fitted_trace[:, 0][j] for j in range(i, len(self.fitted_trace[:, 0]), step_size)]
                )
                y_sampled = np.array(
                    [self.fitted_trace[:, 1][j] for j in range(i, len(self.fitted_trace[:, 1]), step_size)]
                )

                try:
                    tck, u = interp.splprep([x_sampled, y_sampled], s=0, per=2, quiet=1, k=3)
                    out = interp.splev(ev_array, tck)
                    splined_trace = np.column_stack((out[0], out[1]))
                except ValueError:
                    # Value error occurs when the "trace fitting" really messes up the traces

                    x = np.array(
                        [self.ordered_trace[:, 0][j] for j in range(i, len(self.ordered_trace[:, 0]), step_size)]
                    )
                    y = np.array(
                        [self.ordered_trace[:, 1][j] for j in range(i, len(self.ordered_trace[:, 1]), step_size)]
                    )

                    try:
                        tck, u = interp.splprep([x, y], s=0, per=2, quiet=1)
                        out = interp.splev(np.linspace(0, 1, nbr * step_size), tck)
                        splined_trace = np.column_stack((out[0], out[1]))
                    except (
                        ValueError
                    ):  # sometimes even the ordered_traces are too bugged out so just delete these traces
>                       self.mol_is_circular.pop(dna_num)
E                       AttributeError: 'bool' object has no attribute 'pop'

topostats/tracing/dnatracing.py:456: AttributeError
-------------------------------------------------------------------------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------------------------------------------------------------------------
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] : *** DNA Tracing ***
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] : Calculating statistics for 6 grains.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [0] : Gaussian filter applied.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [0] : Skeletonising using topostats method.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] : Traced grain 1 of 6
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [1] : Gaussian filter applied.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [1] : Skeletonising using topostats method.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [1] : Grain skeleton pixels < 10
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] : Traced grain 2 of 6
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [2] : Gaussian filter applied.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [2] : Skeletonising using topostats method.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [2] : Grain skeleton pixels < 10
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] : Traced grain 3 of 6
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [3] : Gaussian filter applied.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [3] : Skeletonising using topostats method.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] : Traced grain 4 of 6
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [4] : Gaussian filter applied.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [4] : Skeletonising using topostats method.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] : Traced grain 5 of 6
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [5] : Gaussian filter applied.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [5] : Skeletonising using topostats method.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] [5] : Grain skeleton pixels < 10
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] : Traced grain 6 of 6
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] : Saving trace json data.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] : Saving trace heights to JSON file.
[Fri, 13 Oct 2023 00:05:37] [INFO    ] [topostats] [dummy filename] : Saving trace cumulative distances to JSON file.
[Fri, 13 Oct 2023 00:05:40] [INFO    ] [topostats] [dummy filename_above_traced] : Image saved to : /private/var/folders/sr/wjtfqr9s6x3bw1s647t649x80000gn/T/pytest-of-sylvi/pytest-6/test_run_dnatracing0/dummy filename_above_traced.png | DPI: 800
[Fri, 13 Oct 2023 00:05:40] [INFO    ] [topostats] [dummy filename] : Calculating statistics for 7 grains.
[Fri, 13 Oct 2023 00:05:40] [INFO    ] [topostats] [dummy filename] [0] : Gaussian filter applied.
[Fri, 13 Oct 2023 00:05:40] [INFO    ] [topostats] [dummy filename] [0] : Skeletonising using topostats method.
[Fri, 13 Oct 2023 00:05:40] [INFO    ] [topostats] [dummy filename] [0] : Grain failed to skeletonise.
[Fri, 13 Oct 2023 00:05:40] [INFO    ] [topostats] [dummy filename] : Grain failed to Skeletonise
[Fri, 13 Oct 2023 00:05:40] [INFO    ] [topostats] [dummy filename] : Traced grain 1 of 7
[Fri, 13 Oct 2023 00:05:40] [INFO    ] [topostats] [dummy filename] [1] : Gaussian filter applied.
[Fri, 13 Oct 2023 00:05:40] [INFO    ] [topostats] [dummy filename] [1] : Skeletonising using topostats method.
[Fri, 13 Oct 2023 00:05:40] [WARNING ] [topostats] [dummy filename] : Errors occurred whilst calculating DNA tracing statistics, returning grain statistics
-------------------------------------------------------------------------------------------------------------------------------------- Captured stderr call ---------------------------------------------------------------------------------------------------------------------------------------
INFO:topostats:[dummy filename] : *** DNA Tracing ***
INFO:topostats:[dummy filename] : Calculating statistics for 6 grains.
INFO:topostats:[dummy filename] [0] : Gaussian filter applied.
INFO:topostats:[dummy filename] [0] : Skeletonising using topostats method.
INFO:topostats:[dummy filename] : Traced grain 1 of 6
INFO:topostats:[dummy filename] [1] : Gaussian filter applied.
INFO:topostats:[dummy filename] [1] : Skeletonising using topostats method.
INFO:topostats:[dummy filename] [1] : Grain skeleton pixels < 10
INFO:topostats:[dummy filename] : Traced grain 2 of 6
INFO:topostats:[dummy filename] [2] : Gaussian filter applied.
INFO:topostats:[dummy filename] [2] : Skeletonising using topostats method.
INFO:topostats:[dummy filename] [2] : Grain skeleton pixels < 10
INFO:topostats:[dummy filename] : Traced grain 3 of 6
INFO:topostats:[dummy filename] [3] : Gaussian filter applied.
INFO:topostats:[dummy filename] [3] : Skeletonising using topostats method.
INFO:topostats:[dummy filename] : Traced grain 4 of 6
INFO:topostats:[dummy filename] [4] : Gaussian filter applied.
INFO:topostats:[dummy filename] [4] : Skeletonising using topostats method.
INFO:topostats:[dummy filename] : Traced grain 5 of 6
INFO:topostats:[dummy filename] [5] : Gaussian filter applied.
INFO:topostats:[dummy filename] [5] : Skeletonising using topostats method.
INFO:topostats:[dummy filename] [5] : Grain skeleton pixels < 10
INFO:topostats:[dummy filename] : Traced grain 6 of 6
INFO:topostats:[dummy filename] : Saving trace json data.
INFO:topostats:[dummy filename] : Saving trace heights to JSON file.
INFO:topostats:[dummy filename] : Saving trace cumulative distances to JSON file.
INFO:topostats:[dummy filename_above_traced] : Image saved to : /private/var/folders/sr/wjtfqr9s6x3bw1s647t649x80000gn/T/pytest-of-sylvi/pytest-6/test_run_dnatracing0/dummy filename_above_traced.png | DPI: 800
INFO:topostats:[dummy filename] : Calculating statistics for 7 grains.
INFO:topostats:[dummy filename] [0] : Gaussian filter applied.
INFO:topostats:[dummy filename] [0] : Skeletonising using topostats method.
INFO:topostats:[dummy filename] [0] : Grain failed to skeletonise.
INFO:topostats:[dummy filename] : Grain failed to Skeletonise
INFO:topostats:[dummy filename] : Traced grain 1 of 7
INFO:topostats:[dummy filename] [1] : Gaussian filter applied.
INFO:topostats:[dummy filename] [1] : Skeletonising using topostats method.
WARNING:topostats:[dummy filename] : Errors occurred whilst calculating DNA tracing statistics, returning grain statistics
---------------------------------------------------------------------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------------------------------------------------------------------
INFO     topostats:processing.py:406 [dummy filename] : *** DNA Tracing ***
INFO     topostats:dnatracing.py:815 [dummy filename] : Calculating statistics for 6 grains.
INFO     topostats:dnatracing.py:221 [dummy filename] [0] : Gaussian filter applied.
INFO     topostats:dnatracing.py:234 [dummy filename] [0] : Skeletonising using topostats method.
INFO     topostats:dnatracing.py:831 [dummy filename] : Traced grain 1 of 6
INFO     topostats:dnatracing.py:221 [dummy filename] [1] : Gaussian filter applied.
INFO     topostats:dnatracing.py:234 [dummy filename] [1] : Skeletonising using topostats method.
INFO     topostats:dnatracing.py:136 [dummy filename] [1] : Grain skeleton pixels < 10
INFO     topostats:dnatracing.py:831 [dummy filename] : Traced grain 2 of 6
INFO     topostats:dnatracing.py:221 [dummy filename] [2] : Gaussian filter applied.
INFO     topostats:dnatracing.py:234 [dummy filename] [2] : Skeletonising using topostats method.
INFO     topostats:dnatracing.py:136 [dummy filename] [2] : Grain skeleton pixels < 10
INFO     topostats:dnatracing.py:831 [dummy filename] : Traced grain 3 of 6
INFO     topostats:dnatracing.py:221 [dummy filename] [3] : Gaussian filter applied.
INFO     topostats:dnatracing.py:234 [dummy filename] [3] : Skeletonising using topostats method.
INFO     topostats:dnatracing.py:831 [dummy filename] : Traced grain 4 of 6
INFO     topostats:dnatracing.py:221 [dummy filename] [4] : Gaussian filter applied.
INFO     topostats:dnatracing.py:234 [dummy filename] [4] : Skeletonising using topostats method.
INFO     topostats:dnatracing.py:831 [dummy filename] : Traced grain 5 of 6
INFO     topostats:dnatracing.py:221 [dummy filename] [5] : Gaussian filter applied.
INFO     topostats:dnatracing.py:234 [dummy filename] [5] : Skeletonising using topostats method.
INFO     topostats:dnatracing.py:136 [dummy filename] [5] : Grain skeleton pixels < 10
INFO     topostats:dnatracing.py:831 [dummy filename] : Traced grain 6 of 6
INFO     topostats:processing.py:428 [dummy filename] : Saving trace json data.
INFO     topostats:processing.py:431 [dummy filename] : Saving trace heights to JSON file.
INFO     topostats:processing.py:441 [dummy filename] : Saving trace cumulative distances to JSON file.
INFO     topostats:plottingfuncs.py:227 [dummy filename_above_traced] : Image saved to : /private/var/folders/sr/wjtfqr9s6x3bw1s647t649x80000gn/T/pytest-of-sylvi/pytest-6/test_run_dnatracing0/dummy filename_above_traced.png | DPI: 800
INFO     topostats:dnatracing.py:815 [dummy filename] : Calculating statistics for 7 grains.
INFO     topostats:dnatracing.py:221 [dummy filename] [0] : Gaussian filter applied.
INFO     topostats:dnatracing.py:234 [dummy filename] [0] : Skeletonising using topostats method.
INFO     topostats:dnatracing.py:254 [dummy filename] [0] : Grain failed to skeletonise.
INFO     topostats:dnatracing.py:122 [dummy filename] : Grain failed to Skeletonise
INFO     topostats:dnatracing.py:831 [dummy filename] : Traced grain 1 of 7
INFO     topostats:dnatracing.py:221 [dummy filename] [1] : Gaussian filter applied.
INFO     topostats:dnatracing.py:234 [dummy filename] [1] : Skeletonising using topostats method.
WARNING  topostats:processing.py:502 [dummy filename] : Errors occurred whilst calculating DNA tracing statistics, returning grain statistics
====================================================================================================================================== fixture duration top =======================================================================================================================================
total          name                num avg            min           
0:00:00.019326 process_scan_config   1 0:00:00.019326 0:00:00.019326
0:00:00.020824         grand total   3 0:00:00.001447 0:00:00.000051
===================================================================================================================================== test call duration top ======================================================================================================================================
total          name                num avg            min           
0:00:03.557830 test_run_dnatracing   1 0:00:03.557830 0:00:03.557830
0:00:03.557830         grand total   1 0:00:03.557830 0:00:03.557830
===================================================================================================================================== test setup duration top =====================================================================================================================================
total          name                num avg            min           
0:00:00.021175 test_run_dnatracing   1 0:00:00.021175 0:00:00.021175
0:00:00.021175         grand total   1 0:00:00.021175 0:00:00.021175
=================================================================================================================================== test teardown duration top ====================================================================================================================================
total          name                num avg            min           
0:00:00.000214         grand total   1 0:00:00.000214 0:00:00.000214

---------- coverage: platform darwin, python 3.10.9-final-0 ----------
Name                                Stmts   Miss  Cover
-------------------------------------------------------
topostats/__main__.py                   2      2     0%
topostats/entry_point.py               79     79     0%
topostats/filters.py                  154    133    14%
topostats/grains.py                   105     81    23%
topostats/grainstats.py               331    281    15%
topostats/io.py                       386    320    17%
topostats/logs/logs.py                 25      0   100%
topostats/plotting.py                 156    127    19%
topostats/plottingfuncs.py            115     41    64%
topostats/processing.py               229    169    26%
topostats/run_topostats.py            113    113     0%
topostats/scars.py                    105     94    10%
topostats/statistics.py                26     20    23%
topostats/theme.py                     40     11    72%
topostats/thresholds.py                31     18    42%
topostats/tracing/dnacurvature.py      24     24     0%
topostats/tracing/dnatracing.py       435    162    63%
topostats/tracing/skeletonize.py       23     12    48%
topostats/tracing/tracingfuncs.py     613    231    62%
topostats/utils.py                     76     58    24%
topostats/validation.py                15     15     0%
-------------------------------------------------------
TOTAL                                3083   1991    35%

===================================================================================================================================== short test summary info =====================================================================================================================================
FAILED tests/test_processing.py::test_run_dnatracing - AttributeError: 'bool' object has no attribute 'pop'
ns-rse commented 11 months ago

On first look at this I think all of the *.pop() can be removed.

They were pop() because each of the classes properties was a list (perhaps dictionary, can't remember but suspect list) as the class was iterating through each of the grains, but if a ValueError was encountered then it failed to trace and so needed removing.

I think, but would have to dig deeper and check, that these attributes (self.mol_is_circular / self.disordered_trace / self.grain / self.ordered_trace) could remain as is and we just set self.splining_success = False here. (I suspect that is too simple a solution though and its more complicated!).

ns-rse commented 11 months ago

Also, if removing these lines works, it would be worth writing (yet another) test to cover this try: ... except:.

SylviaWhittle commented 11 months ago

Thanks @ns-rse

Do you have thoughts about perhaps merging (once ready) #653 to enable us to delete this block of code?

Also, removing the .pop doesn't appear to work as a new error is thrown: the variable spline_running_total undefined.

I have just realised though that I don't think spline_running_total is ever defined...

ns-rse commented 5 months ago

Possible duplicate of #366

MaxGamill-Sheffield commented 5 months ago

@ns-rse These lines aren't in the code anymore so is this ok to be closed? Doesn't seem like a duplicate as this is one specific cause of #366 whereas #366 is surrounding the try/catch.

ns-rse commented 5 months ago

@MaxGamill-Sheffield Agreed on closer inspection this is a possible cause of #366.

The broad try:...except: is still in place on the main branch in topostats.processing.py and when it fails grain statistics are returned which means users will at least get something even if tracing fails (note however that you have disabled this try: ... except: on branch maxgamill-sheffield/800-better-processing see line 400 and line 480 which I only know about because PyLint complained about the use of """...""" to comment a section as it made it look like the return ... wasn't at the end of the function, better to use # for commenting regions).

I've not looked at this in a long time so can't say right now what to do, however, with the desire to get all of the revised tracing/pruning/detangling into the main branch so that papers can be published and new versions released I would be inclined to stay focused on that aspect for now and come back to this once that work is completed.

Removing the broad try: ...except:, which was thrown in because of a desire to have something working quickly if I remember correctly, could perhaps be the next Milestone for TopoStats and would encapsulate this issue.

I personally don't have much of a problem with issues being noted, they are at least then documented and can be dealt with in order of priority but I am wary of assuming things are fixed without checking and don't have masses of capacity to investigate this myself right now given the above priority and the height profile work.