AFM-SPM / TopoStats

An AFM image analysis program to batch process data and obtain statistics from images
https://afm-spm.github.io/TopoStats/
GNU Lesser General Public License v3.0
60 stars 11 forks source link

General Cleanup and Following the Better Tracing Merger #976

Closed MaxGamill-Sheffield closed 4 weeks ago

MaxGamill-Sheffield commented 1 month ago

This PR:

  1. Should prevent the logging warning about unclosed files popping up - this was due to the same logger trying to be setup multiple times.

  2. Close #975 by re-enabling the stats plotting with the for contour length and average_end_to_end distance as these changed.

  3. Actually removes skeletons < min_skel_size from being processed instead of just telling you and doing nothing about it, this prevents a Skan error being thrown.

  4. Close #929 by switching the logging level mostly to debug. Now there is only the following output:

    • config + loading:
      (topo_test) Max:TopoStats Maxgamill$ topostats process -c minicircle_test/mc_config.yaml 
      [Sat, 19 Oct 2024 13:02:52] [INFO    ] [topostats] The YAML configuration file is valid.
      [Sat, 19 Oct 2024 13:02:52] [INFO    ] [topostats] The YAML plotting configuration file is valid.
      [Sat, 19 Oct 2024 13:02:52] [INFO    ] [topostats] Configuration run options are consistent, processing can proceed.
      [Sat, 19 Oct 2024 13:02:52] [INFO    ] [topostats] Configuration file loaded from      : minicircle_test/mc_config.yaml
      [Sat, 19 Oct 2024 13:02:52] [INFO    ] [topostats] Scanning for images in              : tests/resources
      [Sat, 19 Oct 2024 13:02:52] [INFO    ] [topostats] Output directory                    : minicircle_test
      [Sat, 19 Oct 2024 13:02:52] [INFO    ] [topostats] Looking for images with extension   : .spm
      [Sat, 19 Oct 2024 13:02:52] [INFO    ] [topostats] Images with extension .spm in tests/resources : 1
      [Sat, 19 Oct 2024 13:02:52] [INFO    ] [topostats] Thresholding method (Filtering)     : std_dev
      [Sat, 19 Oct 2024 13:02:52] [INFO    ] [topostats] Thresholding method (Grains)        : std_dev
      [Sat, 19 Oct 2024 13:02:52] [INFO    ] [topostats] Extracting image from tests/resources/minicircle.spm
    • The running of the pipeline:
      Processing images from tests/resources, results are under minicircle_test:   0%|                                                                               | 0/1 [00:00<?, ?it/s][Sat, 19 Oct 2024 13:02:56] [INFO    ] [topostats] Processing : minicircle
      [Sat, 19 Oct 2024 13:02:56] [INFO    ] [topostats] [minicircle] : *** Filtering ***
      [Sat, 19 Oct 2024 13:02:58] [INFO    ] [topostats] [minicircle] : Plotting Filtering Images
      [Sat, 19 Oct 2024 13:02:59] [INFO    ] [topostats] [minicircle] : Filters stage completed successfully.
      [Sat, 19 Oct 2024 13:02:59] [INFO    ] [topostats] [minicircle] : *** Grain Finding ***
      [Sat, 19 Oct 2024 13:02:59] [INFO    ] [topostats] [minicircle] : Grains found for direction above : 21
      [Sat, 19 Oct 2024 13:02:59] [INFO    ] [topostats] [minicircle] : Plotting Grain Finding Images
      [Sat, 19 Oct 2024 13:02:59] [INFO    ] [topostats] [minicircle] : Grain Finding stage completed successfully.
      [Sat, 19 Oct 2024 13:02:59] [INFO    ] [topostats] [minicircle] : *** Grain Statistics ***
      [Sat, 19 Oct 2024 13:03:00] [INFO    ] [topostats] [minicircle] : Calculated grainstats for 21 grains.
      [Sat, 19 Oct 2024 13:03:00] [INFO    ] [topostats] [minicircle] : Grainstats stage completed successfully.
      [Sat, 19 Oct 2024 13:03:00] [INFO    ] [topostats] [minicircle] : *** Disordered Tracing ***
      [Sat, 19 Oct 2024 13:03:00] [INFO    ] [topostats] [minicircle] : Calculating Disordered Tracing statistics for 21 grains...
      [Sat, 19 Oct 2024 13:03:07] [INFO    ] [topostats] [minicircle] : Disordered Tracing stage completed successfully.
      [Sat, 19 Oct 2024 13:03:07] [INFO    ] [topostats] [minicircle] : *** Nodestats ***
      [Sat, 19 Oct 2024 13:03:07] [INFO    ] [topostats] [minicircle] : Calculating NodeStats statistics for 21 grains...
      [Sat, 19 Oct 2024 13:03:07] [INFO    ] [topostats] [minicircle] : NodeStats stage completed successfully.
      [Sat, 19 Oct 2024 13:03:07] [INFO    ] [topostats] [minicircle] : *** Ordered Tracing ***
      [Sat, 19 Oct 2024 13:03:07] [INFO    ] [topostats] [minicircle] : Calculating Ordered Traces and statistics for 21 grains...
      [Sat, 19 Oct 2024 13:03:09] [INFO    ] [topostats] [minicircle] : Ordered Tracing stage completed successfully.
      [Sat, 19 Oct 2024 13:03:09] [INFO    ] [topostats] [minicircle] : *** Splining ***
      [Sat, 19 Oct 2024 13:03:09] [INFO    ] [topostats] [minicircle] : Calculating Splining statistics for 25 molecules...
      [Sat, 19 Oct 2024 13:03:10] [INFO    ] [topostats] [minicircle] : Splining stage completed successfully.
      [Sat, 19 Oct 2024 13:03:10] [INFO    ] [topostats] [minicircle] : *** Image Statistics ***
      [Sat, 19 Oct 2024 13:03:10] [INFO    ] [topostats] [minicircle] : Saving image to .topostats file
      Processing images from tests/resources, results are under minicircle_test: 100%|███████████████████████████████████████████████████████████████████████| 1/1 [00:17<00:00, 17.50s/it][Sat, 19 Oct 2024 13:03:10] [INFO    ] [topostats] [minicircle] Processing completed.
    • and the output message:
      
      [Sat, 19 Oct 2024 13:03:10] [INFO    ] [topostats] Saving image stats to : minicircle_test/image_stats.csv.
      [Sat, 19 Oct 2024 13:03:10] [INFO    ] [topostats] Saving all height profiles to minicircle_test/height_profiles.json
      [Sat, 19 Oct 2024 13:03:10] [INFO    ] [topostats] The YAML summarisation config is valid.
      [Sat, 19 Oct 2024 13:03:10] [INFO    ] [topostats] [plotting] Default variable to labels mapping loaded.
      [Sat, 19 Oct 2024 13:03:10] [INFO    ] [topostats] Summary plots and statistics will be saved to : minicircle_test/summary_distributions
      [Sat, 19 Oct 2024 13:03:12] [INFO    ] [topostats] Folder-wise statistics saved to: minicircle_test/folder_grain_stats.csv
      [Sat, 19 Oct 2024 13:03:12] [INFO    ] [topostats] Folder-wise statistics saved to: minicircle_test/folder_disordered_trace_stats.csv
      [Sat, 19 Oct 2024 13:03:12] [INFO    ] [topostats] Folder-wise statistics saved to: minicircle_test/folder_mol_stats.csv
      [Sat, 19 Oct 2024 13:03:12] [INFO    ] [topostats] 

  TopoStats Version           : 2.2.2.dev904+g68868e280.d20241018
  Base Directory              : tests/resources
  File Extension              : .spm
  Files Found                 : 1
  Successfully Processed^1    : 1 (100.0%)
  All statistics              : minicircle_test/all_statistics.csv
  Distribution Plots          : minicircle_test/summary_distributions

  Configuration               : minicircle_test/config.yaml

  Email                       : topostats@sheffield.ac.uk
  Documentation               : https://afm-spm.github.io/topostats/
  Source Code                 : https://github.com/AFM-SPM/TopoStats/
  Bug Reports/Feature Request : https://github.com/AFM-SPM/TopoStats/issues/new/choose
  Citation File Format        : https://github.com/AFM-SPM/TopoStats/blob/main/CITATION.cff

  ^1 Successful processing of an image is detection of grains and calculation of at least
     grain statistics. If these have been disabled the percentage will be 0.

  If you encounter bugs/issues or have feature requests please report them at the above URL
  or email us.

  If you have found TopoStats useful please consider citing it. A Citation File Format is
  linked above and available from the Source Code page.
ns-rse commented 1 month ago

Neat, what does minicircle_test/mc_config.yaml look like now? I've tried this and still get verbose INFO, tried switching log_level: warning in a custom output but had very little output beyond the tqdm process bar.

MaxGamill-Sheffield commented 1 month ago

Neat, what does minicircle_test/mc_config.yaml look like now? I've tried this and still get verbose INFO, tried switching log_level: warning in a custom output but had very little output beyond the tqdm process bar.

I jumped the gun with the text because I had it all ready to go but wanted to make commits which shouldn't break the tests at each commit.

I found the source of the issue I mentioned this morning and now this should now be ready for review (maybe a few .topostats or .csv test files need to be added first but that should be ok as I've got them on my system).

MaxGamill-Sheffield commented 1 month ago

With the topostats/default_config.yaml I still get a lot of LOGGER.info() output...

This is very unusual as I only see the following when running a single file:

(topo_test) Max:TopoStats Maxgamill$ topostats process -c workshop/ws_config.yaml
[Tue, 22 Oct 2024 15:24:40] [INFO    ] [topostats] The YAML configuration file is valid.
[Tue, 22 Oct 2024 15:24:41] [INFO    ] [topostats] The YAML plotting configuration file is valid.
[Tue, 22 Oct 2024 15:24:41] [INFO    ] [topostats] Configuration run options are consistent, processing can proceed.
[Tue, 22 Oct 2024 15:24:41] [INFO    ] [topostats] Configuration file loaded from      : workshop/ws_config.yaml
[Tue, 22 Oct 2024 15:24:41] [INFO    ] [topostats] Scanning for images in              : workshop
[Tue, 22 Oct 2024 15:24:41] [INFO    ] [topostats] Output directory                    : workshop
[Tue, 22 Oct 2024 15:24:41] [INFO    ] [topostats] Looking for images with extension   : .spm
[Tue, 22 Oct 2024 15:24:41] [INFO    ] [topostats] Images with extension .spm in workshop : 1
[Tue, 22 Oct 2024 15:24:41] [INFO    ] [topostats] Thresholding method (Filtering)     : std_dev
[Tue, 22 Oct 2024 15:24:41] [INFO    ] [topostats] Thresholding method (Grains)        : std_dev
[Tue, 22 Oct 2024 15:24:41] [INFO    ] [topostats] Extracting image from workshop/20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022.spm
Processing images from workshop, results are under workshop:   0%|                                                                                             | 0/1 [00:00<?, ?it/s][Tue, 22 Oct 2024 15:24:44] [INFO    ] [topostats] Processing : 20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022
[Tue, 22 Oct 2024 15:24:44] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : *** Filtering ***
[Tue, 22 Oct 2024 15:24:44] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Plotting Filtering Images
[Tue, 22 Oct 2024 15:24:45] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Filters stage completed successfully.
[Tue, 22 Oct 2024 15:24:45] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : *** Grain Finding ***
[Tue, 22 Oct 2024 15:24:45] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Grains found for direction above : 2
[Tue, 22 Oct 2024 15:24:45] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Plotting Grain Finding Images
[Tue, 22 Oct 2024 15:24:46] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Grain Finding stage completed successfully.
[Tue, 22 Oct 2024 15:24:46] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : *** Grain Statistics ***
[Tue, 22 Oct 2024 15:24:46] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Calculated grainstats for 2 grains.
[Tue, 22 Oct 2024 15:24:46] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Grainstats stage completed successfully.
[Tue, 22 Oct 2024 15:24:46] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : *** Disordered Tracing ***
[Tue, 22 Oct 2024 15:24:46] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Calculating Disordered Tracing statistics for 2 grains...
[Tue, 22 Oct 2024 15:24:48] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Disordered Tracing stage completed successfully.
[Tue, 22 Oct 2024 15:24:48] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : *** Nodestats ***
[Tue, 22 Oct 2024 15:24:48] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Calculating NodeStats statistics for 2 grains...
[Tue, 22 Oct 2024 15:24:48] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : NodeStats stage completed successfully.
[Tue, 22 Oct 2024 15:24:48] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : *** Ordered Tracing ***
[Tue, 22 Oct 2024 15:24:48] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Calculating Ordered Traces and statistics for 2 grains...
[Tue, 22 Oct 2024 15:24:48] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Ordered Tracing stage completed successfully.
[Tue, 22 Oct 2024 15:24:48] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : *** Splining ***
[Tue, 22 Oct 2024 15:24:48] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Calculating Splining statistics for 2 molecules...
[Tue, 22 Oct 2024 15:24:49] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Splining stage completed successfully.
[Tue, 22 Oct 2024 15:24:49] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : *** Image Statistics ***
[Tue, 22 Oct 2024 15:24:49] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] : Saving image to .topostats file
Processing images from workshop, results are under workshop: 100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:08<00:00,  8.04s/it][Tue, 22 Oct 2024 15:24:49] [INFO    ] [topostats] [20230823_TAF_IR_minicircle_1.5ng_CaptureFile.0_00022] Processing completed.
Processing images from workshop, results are under workshop: 100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:08<00:00,  8.04s/it]
[Tue, 22 Oct 2024 15:24:49] [INFO    ] [topostats] Saving image stats to : workshop/image_stats.csv.
[Tue, 22 Oct 2024 15:24:49] [INFO    ] [topostats] Saving all height profiles to workshop/height_profiles.json
[Tue, 22 Oct 2024 15:24:49] [INFO    ] [topostats] The YAML summarisation config is valid.
[Tue, 22 Oct 2024 15:24:49] [INFO    ] [topostats] [plotting] Default variable to labels mapping loaded.
[Tue, 22 Oct 2024 15:24:49] [INFO    ] [topostats] Summary plots and statistics will be saved to : workshop/summary_distributions
[Tue, 22 Oct 2024 15:24:50] [INFO    ] [topostats] Folder-wise statistics saved to: workshop/folder_grain_stats.csv
[Tue, 22 Oct 2024 15:24:50] [INFO    ] [topostats] Folder-wise statistics saved to: workshop/folder_disordered_trace_stats.csv
[Tue, 22 Oct 2024 15:24:51] [INFO    ] [topostats] Folder-wise statistics saved to: workshop/folder_mol_stats.csv
[Tue, 22 Oct 2024 15:24:51] [INFO    ] [topostats] 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ COMPLETE ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  TopoStats Version           : 2.2.2.dev923+gd3444cdc8
  Base Directory              : workshop
  File Extension              : .spm
  Files Found                 : 1
  Successfully Processed^1    : 1 (100.0%)
  All statistics              : workshop/all_statistics.csv
  Distribution Plots          : workshop/summary_distributions

  Configuration               : workshop/config.yaml

  Email                       : topostats@sheffield.ac.uk
  Documentation               : https://afm-spm.github.io/topostats/
  Source Code                 : https://github.com/AFM-SPM/TopoStats/
  Bug Reports/Feature Request : https://github.com/AFM-SPM/TopoStats/issues/new/choose
  Citation File Format        : https://github.com/AFM-SPM/TopoStats/blob/main/CITATION.cff

  ^1 Successful processing of an image is detection of grains and calculation of at least
     grain statistics. If these have been disabled the percentage will be 0.

  If you encounter bugs/issues or have feature requests please report them at the above URL
  or email us.

  If you have found TopoStats useful please consider citing it. A Citation File Format is
  linked above and available from the Source Code page.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(topo_test) Max:TopoStats Maxgamill$ 

Do we want to consider setting the default log level to WARNING so that only warnings are reported perhaps?

I'd want to avoid this so that the users know that TopoStats IS working and not just stalling. I believe that seeing some progress is better than no progress. Are you able to retry this branch and double check the output for you is not the same as I see?

CUDA Warnings

Not everyone has nVidia grahpics cards (or dedicated graphics cards for that matter). My system does but I don't have the CUDA drivers installed so see this error. This might be annoying to users without and we should look to capture and suppress or replace this message with a simpler one.

No idea what this is and haven't seen it before. I looks like it might be from a Tensorflow dependancy? I could take a look into it but I don't think its from anything that has been added in this PR or the better tracing ones - this might be from the unets?

We could disable all warnings with the following but is this wise?

import warnings
warnings.filterwarnings("ignore")

topostats.utils.update_config()

Line 123 still has a LOGGER.info() which I think is responsible for the first big chunk about savefig_dpi being updated. In part this comes from appending the plotting_config.yaml to the config.yaml (as dictionaries). Do we want to tell people when configuration has been updated or not? I would guess that people will know when they are using command line option flags to update parameters and these are all written to the resulting .yaml that accompanies output anyway.

filters.py

Still has a lot of LOGGER.info() messages in. Will users for example be interested or nead to know if Median flattening with[out] mask has been performed? They would notionally know if this has been modified from the default if they have changed it in the configuration file.

Ditto for many other actions such as Plane tilt removal with[out] mask.

Perhaps filters.py was missed in the find/replace LOGGER.info > LOGGER.debug | LOGGER.warning | LOGGER.error.

grainstats.py

Not sure we need to report that Processing of grain : # and definitely reporting that Height profiles extracted. seems excessive.

Plotting

Do we need to report the plotting of all figures and changes to how plots are made? Probably not I feel.

General Comment

I know I'm the one who started liberally sprinkling LOGGER.info() everywhere early on in the refactoring so I apologise for that.

Please could you re-run these? I've checked the utils.py (at the least) and line 123 (along with all the other logging and caplog updates was updated the other day and so should have the affect of all of the above.

Broadly I think longer term it would be useful, as I've mentioned previously, to switch to loguru throughout. One useful feature, probably not unique though, is the ability to easily format the log output to indicate which module messages are coming from (which is really useful when debugging). It also only requires a single handler to be added.

Aye, Sylvia mentioned this but I wanted to provide something quick to tidy up the terminal outputs and errors and I didn't know how loguru would implement its own handlers or be linked to capslog so I thought I should leave it for now :)

P.S. - Use of keyword Close won't be picked up because its nested within unbullteted lists.

Ah nightmare, but cool, I'll tag the issues on this

ns-rse commented 1 month ago

No idea what this is and haven't seen it before. I looks like it might be from a Tensorflow dependancy? I could take a look into it but I don't think its from anything that has been added in this PR or the better tracing ones - this might be from the unets?

Yes, this is due to Tensorflow.

We could disable all warnings with the following but is this wise?

But then users wouldn't see TopoStats warnings so I'm not sure that is what is required.

A quick search led me to this solution

Putting the following in __init__.py would suppress the warnings

import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

Incoming PR (with small "bonus").

Are you able to retry this branch and double check the output for you is not the same as I see?

Seems I hadn't git pull'd the branch, my apologies. Much quieter output now, once the TensorFlow message supression is merged this is good to go.

loguru

I wasn't proposing overhauling things with regards to loguru here and have made some simple forays into switching on Monday but its a bit more involved so I've shelved it in favour of focusing on #517 as a priority.

I understand the need to always do things "quickly" but sometimes its a false economy spending time on a quick fix and then (maybe) going back and doing the larger piece of work that should have been done in the first place (this is in essence what accumulates as "technical debt"). We don't always get things right first time, and I know I'm continually learning better ways of doing things. Taking ownership of work and doing it to the best of our ability helps minimise the accumulation of technical debt.

MaxGamill-Sheffield commented 1 month ago

Putting the following in __init__.py would suppress the warnings

import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

Cool I've added this now! @llwiggins have you tested this yet and is it good to go?

ns-rse commented 1 month ago

It was in the PR I created for this branch, I've just fixed the failing test (I tried to keep output clean by switching from LOGGER.info() to print in the completion message but the tests look for lines in log so I've reverted it).

ns-rse commented 4 weeks ago

There is now duplicate lines for disabling Tensor Flow.

I will remove one.