AFM-SPM / TopoStats

An AFM image analysis program to batch process data and obtain statistics from images
https://afm-spm.github.io/TopoStats/
GNU Lesser General Public License v3.0

Different processing success rates with different numbers of cores #594

Open derollins opened 1 year ago

derollins commented 1 year ago

Checklist

Please try and tick off each of these items when filing the bug report. There are further instructions on each below.

Describe the bug

When run on 6 cores, 50 of the 69 files in a particular dataset are processed successfully. When run on 3 cores, 67 images are processed successfully, with the other config parameters remaining unchanged. I also ran the same dataset and config with 3 cores on a different PC and got 67/69 successes there as well.

Copy of the output

6 cores:

[Thu, 01 Jun 2023 14:08:41] [INFO    ] [topostats] Folder-wise statistics saved to: C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65\Cas9DNA\ONTarget_SC/folder_grainstats.csv
[Thu, 01 Jun 2023 14:08:41] [INFO    ] [topostats]

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ COMPLETE ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  TopoStats Version           : 2.1.1.dev47+g1d02cbc7
  Base Directory              : C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9
  File Extension              : .spm
  Files Found                 : 69
  Successfully Processed^1    : 50 (72.46376811594203%)
  Configuration               : C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65/config.yaml
  All statistics              : C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65/all_statistics.csv
  Distribution Plots          : C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65\summary_distributions

  Email                       : topostats@sheffield.ac.uk
  Documentation               : https://afm-spm.github.io/topostats/
  Source Code                 : https://github.com/AFM-SPM/TopoStats/
  Bug Reports/Feature Request : https://github.com/AFM-SPM/TopoStats/issues/new/choose
  Citation File Format        : https://github.com/AFM-SPM/TopoStats/blob/main/CITATION.cff

  ^1 Successful processing of an image is detection of grains and calculation of at least
     grain statistics. If these have been disabled the percentage will be 0.

  If you encounter bugs/issues or have feature requests please report them at the above URL
  or email us.

  If you have found TopoStats useful please consider citing it. A Citation File Format is
  linked above and available from the Source Code page.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

3 cores:

[Thu, 01 Jun 2023 14:36:00] [INFO    ] [topostats] Folder-wise statistics saved to: C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65_3cores\Minicircles\SC_OFF/folder_grainstats.csv
[Thu, 01 Jun 2023 14:36:00] [INFO    ] [topostats] Folder-wise statistics saved to: C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65_3cores\Cas9DNA\ONTarget_SC/folder_grainstats.csv
[Thu, 01 Jun 2023 14:36:00] [INFO    ] [topostats]

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ COMPLETE ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  TopoStats Version           : 2.1.1.dev47+g1d02cbc7
  Base Directory              : C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9
  File Extension              : .spm
  Files Found                 : 69
  Successfully Processed^1    : 67 (97.10144927536231%)
  Configuration               : C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65_3cores/config.yaml
  All statistics              : C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65_3cores/all_statistics.csv
  Distribution Plots          : C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65_3cores\summary_distributions

  Email                       : topostats@sheffield.ac.uk
  Documentation               : https://afm-spm.github.io/topostats/
  Source Code                 : https://github.com/AFM-SPM/TopoStats/
  Bug Reports/Feature Request : https://github.com/AFM-SPM/TopoStats/issues/new/choose
  Citation File Format        : https://github.com/AFM-SPM/TopoStats/blob/main/CITATION.cff

  ^1 Successful processing of an image is detection of grains and calculation of at least
     grain statistics. If these have been disabled the percentage will be 0.

  If you encounter bugs/issues or have feature requests please report them at the above URL
  or email us.

  If you have found TopoStats useful please consider citing it. A Citation File Format is
  linked above and available from the Source Code page.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

My laptop 3 cores:

[Thu, 01 Jun 2023 07:55:58] [INFO    ] [topostats] Folder-wise statistics saved to: C:\Users\Work\OneDrive\Documents\Uni\Research\Data\Projects\Cas9_Minicircles\Analysis\DNA_Cas9\outputabs1_65\Cas9DNA\OFFTarget_SC/folder_grainstats.csv
[Thu, 01 Jun 2023 07:55:58] [INFO    ] [topostats]

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ COMPLETE ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  TopoStats Version           : 2.1.1.dev10+g9762b08.d20230526
  Base Directory              : C:\Users\Work\OneDrive\Documents\Uni\Research\Data\Projects\Cas9_Minicircles\Analysis\DN
  File Extension              : .spm
  Files Found                 : 69
  Successfully Processed^1    : 67 (97.10144927536231%)
  Configuration               : C:\Users\Work\OneDrive\Documents\Uni\Research\Data\Projects\Cas9_Minicircles\Analysis\DNA_Cas9\outputabs1_65/config.yaml
  All statistics              : C:\Users\Work\OneDrive\Documents\Uni\Research\Data\Projects\Cas9_Minicircles\Analysis\DNA_Cas9\outputabs1_65/all_statistics.csv
  Distribution Plots          : C:\Users\Work\OneDrive\Documents\Uni\Research\Data\Projects\Cas9_Minicircles\Analysis\DNA_Cas9\outputabs1_65\summary_distributions

  Email                       : topostats@sheffield.ac.uk
  Documentation               : https://afm-spm.github.io/topostats/
  Source Code                 : https://github.com/AFM-SPM/TopoStats/
  Bug Reports/Feature Request : https://github.com/AFM-SPM/TopoStats/issues/new/choose
  Citation File Format        : https://github.com/AFM-SPM/TopoStats/blob/main/CITATION.cff

  ^1 Successful processing of an image is detection of grains and calculation of at least
     grain statistics. If these have been disabled the percentage will be 0.

  If you encounter bugs/issues or have feature requests please report them at the above URL
  or email us.

  If you have found TopoStats useful please consider citing it. A Citation File Format is
  linked above and available from the Source Code page.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Include the configuration file

If no configuration file was specified with the -c/--config-file option, the defaults were used. Please run run_topostats --create-config-file crash.yaml to save these to the crash.yaml file and copy the contents below.

6 cores:

# Configuration from TopoStats run completed : 2023-06-01 14:08:41
# For more information on configuration and how to use it:
# https://afm-spm.github.io/TopoStats/main/configuration.html
base_dir: C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9
output_dir: C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65
log_level: info
cores: 6
file_ext: .spm
loading:
  channel: Height
filter:
  run: true
  row_alignment_quantile: 0.5
  threshold_method: std_dev
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0
    above: 1.0
  threshold_absolute:
    below: -1.0
    above: 1.0
  gaussian_size: 1.0121397464510862
  gaussian_mode: nearest
  remove_scars:
    run: true
    removal_iterations: 2
    threshold_low: 0.25
    threshold_high: 0.666
    max_scar_width: 4
    min_scar_length: 16
grains:
  run: true
  threshold_method: absolute
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0
    above: 1.0
  threshold_absolute:
    below: -1.0
    above: 1.65
  direction: above
  smallest_grain_size_nm2: 50
  absolute_area_threshold:
    above:
    - 100
    - 500
    below:
    - 
    - 
grainstats:
  run: true
  edge_detection_method: binary_erosion
  cropped_size: 40.0
dnatracing:
  run: true
  min_skeleton_size: 10
plotting:
  run: true
  save_format: png
  pixel_interpolation:
  image_set: all
  zrange:
  - -4
  - 4
  colorbar: true
  axes: true
  cmap: nanoscope
  mask_cmap: blu
  histogram_log_axis: false
  histogram_bins: 200
  dpi: 1000
summary_stats:
  run: true
  config:

3 cores:

# Configuration from TopoStats run completed : 2023-06-01 14:36:00
# For more information on configuration and how to use it:
# https://afm-spm.github.io/TopoStats/main/configuration.html
base_dir: C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9
output_dir: C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65_3cores
log_level: info
cores: 3
file_ext: .spm
loading:
  channel: Height
filter:
  run: true
  row_alignment_quantile: 0.5
  threshold_method: std_dev
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0
    above: 1.0
  threshold_absolute:
    below: -1.0
    above: 1.0
  gaussian_size: 1.0121397464510862
  gaussian_mode: nearest
  remove_scars:
    run: true
    removal_iterations: 2
    threshold_low: 0.25
    threshold_high: 0.666
    max_scar_width: 4
    min_scar_length: 16
grains:
  run: true
  threshold_method: absolute
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0
    above: 1.0
  threshold_absolute:
    below: -1.0
    above: 1.65
  direction: above
  smallest_grain_size_nm2: 50
  absolute_area_threshold:
    above:
    - 100
    - 500
    below:
    - 
    - 
grainstats:
  run: true
  edge_detection_method: binary_erosion
  cropped_size: 40.0
dnatracing:
  run: true
  min_skeleton_size: 10
plotting:
  run: true
  save_format: png
  pixel_interpolation:
  image_set: core
  zrange:
  - -4
  - 4
  colorbar: true
  axes: true
  cmap: nanoscope
  mask_cmap: blu
  histogram_log_axis: false
  histogram_bins: 200
  dpi: 1000
summary_stats:
  run: true
  config:

My laptop 3 cores:

# Configuration from TopoStats run completed : 2023-06-01 07:55:58
# For more information on configuration and how to use it:
# https://afm-spm.github.io/TopoStats/main/configuration.html
base_dir: C:\Users\Work\OneDrive\Documents\Uni\Research\Data\Projects\Cas9_Minicircles\Analysis\DNA_Cas9
output_dir: C:\Users\Work\OneDrive\Documents\Uni\Research\Data\Projects\Cas9_Minicircles\Analysis\DNA_Cas9\outputabs1_65
log_level: info
cores: 3
file_ext: .spm
loading:
  channel: Height
filter:
  run: true
  row_alignment_quantile: 0.5
  threshold_method: std_dev
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0
    above: 1.0
  threshold_absolute:
    below: -1.0
    above: 1.0
  gaussian_size: 1.0121397464510862
  gaussian_mode: nearest
  remove_scars:
    run: true
    removal_iterations: 2
    threshold_low: 0.25
    threshold_high: 0.666
    max_scar_width: 4
    min_scar_length: 16
grains:
  run: true
  threshold_method: absolute
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0
    above: 1.0
  threshold_absolute:
    below: -1.0
    above: 1.65
  direction: above
  smallest_grain_size_nm2: 50
  absolute_area_threshold:
    above:
    - 100
    - 500
    below:
    - 
    - 
grainstats:
  run: true
  edge_detection_method: binary_erosion
  cropped_size: 40.0
dnatracing:
  run: true
  min_skeleton_size: 10
plotting:
  run: true
  save_format: png
  pixel_interpolation:
  image_set: all
  zrange:
  - -4
  - 4
  colorbar: true
  axes: true
  cmap: nanoscope
  mask_cmap: blu
  histogram_log_axis: false
  histogram_bins: 200
  dpi: 1000
summary_stats:
  run: true
  config:
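
One way to check whether two runs really used the same settings is to diff the saved config.yaml files key by key. Below is a minimal sketch, assuming the configs have already been loaded into Python dictionaries (for example with PyYAML's yaml.safe_load). The helpers flatten and diff_configs are hypothetical names for illustration and are not part of TopoStats. The two dictionaries are abridged copies of the 6-core and 3-core NanoLab configs above, with output_dir left out.

```python
def flatten(config, prefix=""):
    """Flatten a nested dict into {"section.key": value} pairs."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(a, b):
    """Return {path: (value_in_a, value_in_b)} for every setting that differs."""
    flat_a, flat_b = flatten(a), flatten(b)
    return {
        path: (flat_a.get(path), flat_b.get(path))
        for path in sorted(flat_a.keys() | flat_b.keys())
        if flat_a.get(path) != flat_b.get(path)
    }


# Abridged from the 6-core and 3-core configs in this report (output_dir omitted).
six_cores = {
    "cores": 6,
    "file_ext": ".spm",
    "grains": {"threshold_method": "absolute", "threshold_absolute": {"above": 1.65}},
    "plotting": {"image_set": "all", "dpi": 1000},
}
three_cores = {
    "cores": 3,
    "file_ext": ".spm",
    "grains": {"threshold_method": "absolute", "threshold_absolute": {"above": 1.65}},
    "plotting": {"image_set": "core", "dpi": 1000},
}

for path, (left, right) in diff_configs(six_cores, three_cores).items():
    print(f"{path}: {left!r} -> {right!r}")
```

For these two abridged configs the sketch reports cores (6 vs 3) and plotting.image_set (all vs core) as the settings that differ.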

To Reproduce

If it is possible to share the file (e.g. via cloud services) that caused the error that would greatly assist in reproducing and investigating the problem. In addition the exact command used that failed should be pasted below.

Data is unpublished; contact me if required.

run_topostats -c DNA_Cas9_config_v3.yaml

Output

Any output files that have been produced can be attached to this bug report (by default they will be under the output directory unless you have customised the configuration).

6 cores:

DNA_Cas9-2023-06-01-12-07-48.log DNA_Cas9-2023-06-01-12-07-55.log

3 cores:

DNA_Cas9-2023-06-01-14-20-23.log DNA_Cas9-2023-06-01-14-20-30.log

My laptop 3 cores:

DNA_Cas9-2023-06-01-01-09-04.log DNA_Cas9-2023-06-01-01-09-15.log

TopoStats version

Please report the version of TopoStats you are using. There are several ways of doing this, either with pip or run_topostats. Please copy and paste all output from either of the following commands.

Installed version of TopoStats : 2.1.1.dev47+g1d02cbc7

Operating System and Python Version

Operating System

Please let us know what operating system you are using, if you have used more than one then tick all boxes.

Python Version

Please let us know the version of Python you are using, paste the results of python --version

Python 3.10.9

Optional : Python Packages

If you are able to provide a list of your installed packages that may be useful. The best way to get this is to copy and paste the results of typing pip freeze.

certifi @ file:///C:/b/abs_85o_6fm0se/croot/certifi_1671487778835/work/certifi
colorama==0.4.6
contextlib2==21.6.0
contourpy==1.0.7
cycler==0.11.0
fonttools==4.38.0
igor==0.3
imageio==2.25.0
joblib==1.2.0
kiwisolver==1.4.4
matplotlib==3.6.3
networkx==3.0
numpy==1.23.4
packaging==23.0
pandas==1.5.3
Pillow==9.4.0
pyfiglet==0.8.post1
pyparsing==3.0.9
pySPM==0.2.23
python-dateutil==2.8.2
pytz==2022.7.1
PyWavelets==1.4.1
PyYAML==6.0
ruamel.yaml==0.17.21
ruamel.yaml.clib==0.2.7
schema==0.7.5
scikit-image==0.19.2
scikit-learn==1.2.1
scipy==1.10.0
seaborn==0.12.2
six==1.16.0
threadpoolctl==3.1.0
tifffile==2023.2.3
topostats @ file:///C:/Users/NanoLab/Documents/TopoStats%20%28Development%29/TopoStats
tqdm==4.64.1
wincertstore==0.2


derollins commented 1 year ago

Update:

When run on the nanocharacterisation lab analysis PC (the PC that isn't my laptop above) on 3 cores with image set: all, 50 images out of 69 were successfully processed.

Config file:

# Sample configuration file auto-generated : 2023-05-26 09:29:17
# For more information on configuration : https://afm-spm.github.io/TopoStats/main/configuration.html
base_dir: C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9
output_dir: C:\Users\NanoLab\Desktop\TopoStats-User-Data\Eddie\DNA_Cas9\outputabs1_65_3cores
log_level: info
cores: 3
file_ext: .spm
loading:
  channel: Height
filter:
  run: true
  row_alignment_quantile: 0.5
  threshold_method: std_dev
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0
    above: 1.0
  threshold_absolute:
    below: -1.0
    above: 1.0
  gaussian_size: 1.0121397464510862
  gaussian_mode: nearest
  remove_scars:
    run: true
    removal_iterations: 2
    threshold_low: 0.25
    threshold_high: 0.666
    max_scar_width: 4
    min_scar_length: 16
grains:
  run: true
  threshold_method: absolute
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0
    above: 1.0
  threshold_absolute:
    below: -1.0
    above: 1.65
  direction: above
  smallest_grain_size_nm2: 50
  absolute_area_threshold:
    above:
    - 100
    - 500
    below:
    - 
    - 
grainstats:
  run: true
  edge_detection_method: binary_erosion
  cropped_size: 40.0
dnatracing:
  run: true
  min_skeleton_size: 10
plotting:
  run: true
  save_format: png
  pixel_interpolation:
  image_set: all
  zrange:
  - -4
  - 4
  colorbar: true
  axes: true
  cmap: nanoscope
  mask_cmap: blu
  histogram_log_axis: false
  histogram_bins: 200
  dpi: 1000
summary_stats:
  run: true
  config:

Log files: DNA_Cas9-2023-06-01-14-49-39.log DNA_Cas9-2023-06-01-14-49-44.log

Everything else the same.
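
To see which images succeed in one run but not in another, the image names in the two all_statistics.csv files could be compared. A rough sketch follows, assuming each file has a column naming the source image. The column name image is an assumption and may differ in the real output. The helper names are hypothetical, and the CSV text uses made-up placeholder rows rather than the real data.

```python
import csv
import io


def processed_images(csv_text, column="image"):
    """Return the set of distinct image names listed in a statistics CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader}


def missing_from(reference_csv, other_csv, column="image"):
    """Images present in the reference run but absent from the other run."""
    reference = processed_images(reference_csv, column)
    other = processed_images(other_csv, column)
    return sorted(reference - other)


# Placeholder rows only; real files would be read with pathlib.Path(...).read_text().
run_a = "image,grain_number\nimg_01,0\nimg_01,1\nimg_02,0\nimg_03,0\n"
run_b = "image,grain_number\nimg_01,0\nimg_03,0\n"

print(missing_from(run_a, run_b))
```

Running this against the statistics from a 67/69 run and a 50/69 run would list the images that only the first run processed, which can then be looked up in the log files.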

derollins commented 1 year ago

Another update:

When run on my laptop on 1 core with image set: all, 67 images out of 69 were successfully processed.

Config:

# Configuration from TopoStats run completed : 2023-06-02 00:23:29
# For more information on configuration and how to use it:
# https://afm-spm.github.io/TopoStats/main/configuration.html
base_dir: C:\Users\Work\OneDrive\Documents\Uni\Research\Data\Projects\Cas9_Minicircles\Analysis\DNA_Cas9
output_dir: C:\Users\Work\OneDrive\Documents\Uni\Research\Data\Projects\Cas9_Minicircles\Analysis\DNA_Cas9\outputabs1_65_1core
log_level: info
cores: 1
file_ext: .spm
loading:
  channel: Height
filter:
  run: true
  row_alignment_quantile: 0.5
  threshold_method: std_dev
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0
    above: 1.0
  threshold_absolute:
    below: -1.0
    above: 1.0
  gaussian_size: 1.0121397464510862
  gaussian_mode: nearest
  remove_scars:
    run: true
    removal_iterations: 2
    threshold_low: 0.25
    threshold_high: 0.666
    max_scar_width: 4
    min_scar_length: 16
grains:
  run: true
  threshold_method: absolute
  otsu_threshold_multiplier: 1.0
  threshold_std_dev:
    below: 10.0
    above: 1.0
  threshold_absolute:
    below: -1.0
    above: 1.65
  direction: above
  smallest_grain_size_nm2: 50
  absolute_area_threshold:
    above:
    - 100
    - 500
    below:
    - 
    - 
grainstats:
  run: true
  edge_detection_method: binary_erosion
  cropped_size: 40.0
dnatracing:
  run: true
  min_skeleton_size: 10
plotting:
  run: true
  save_format: png
  pixel_interpolation:
  image_set: core
  zrange:
  - -4
  - 4
  colorbar: true
  axes: true
  cmap: nanoscope
  mask_cmap: blu
  histogram_log_axis: false
  histogram_bins: 200
  dpi: 1000
summary_stats:
  run: true
  config:

Log files: DNA_Cas9-2023-06-01-23-41-14.log DNA_Cas9-2023-06-01-23-41-25.log