clemsgrs / hs2p

Histopathology Slides Preprocessing Pipeline

Ensuring Consistency in Two Consecutive Patches Distance #16

Closed bryanwong17 closed 1 month ago

bryanwong17 commented 1 month ago

Hi, I am trying to extract raw patches from Camelyon16 (C16) using hs2p with the configuration below:

csv: 'camelyon16.csv' # path to the .csv / .txt file containing slides paths

output_dir: '/vast/WSI_datasets/camelyon16' # folder where to save algorithm output
experiment_name: 'patch_extraction'
resume: false # whether or not to resume existing experiment
resume_id:

backend: 'openslide' # which image backend should be used when opening whole slide images (choose among 'openslide', 'pyvips' or 'asap')

flags:
  patch: true # whether or not to extract patches from segmented tissue regions
  visu: true # whether or not to generate a .jpg image to visualize patching results
  verbose: false

seg_params:
  downsample: 64 # if seg_level = -1, then uses this value to find the closest downsample level in the WSI for tissue segmentation computation
  seg_level: -1 # updated seg_level
  sthresh: 15 # updated segmentation threshold
  mthresh: 11 # updated median filter size
  close: 2 # updated morphological closing
  use_otsu: false # updated use_otsu method
  save_mask: false # save tissue mask to disk as a .tif image
  visualize_mask: true # save a visualization of the tissue mask as a .jpg image
  tissue_pixel_value: 1 # value of tissue pixel in pre-computed segmentation masks

filter_params:
  ref_patch_size: 512 # reference patch size at spacing patch_params.spacing
  a_t: 1 # updated area filter threshold for tissue
  a_h: 1 # updated area filter threshold for holes
  max_n_holes: 2 # updated maximum number of holes to consider per detected foreground contours

vis_params:
  downsample: 64 # if vis_level = -1, then uses this value to find the closest downsample level in the WSI for tissue segmentation visualization
  vis_level: -1 # updated vis_level
  downscale: 64 # downsample to visualize the result of patch extraction
  line_thickness: 50 # updated line thickness to draw the segmentation results (positive integer)

patch_params:
  spacing: 0.25 # pixel spacing (in micron/pixel) at which patches should be extracted (will find the level whose spacing is closest to this value)
  patch_size: 512 # updated patch size
  overlap: 0. # percentage of overlap between two consecutive patches (float between 0 and 1)
  use_padding: true # updated whether to pad the border of the slide
  contour_fn: 'pct' # updated contour checking function to decide whether a patch should be considered foreground or background
  tissue_thresh: 0.5 # if contour_fn = 'pct', threshold used to filter out patches that have less tissue than this value (percentage)
  drop_holes: false # whether or not to drop patches whose center pixel falls within an identified hole
  save_patches_to_disk: true # whether or not to save patches as images to disk
  save_patches_in_common_dir: false # whether to save patches from different slides in a single common directory
  save_npy: false # whether to save patch info in a .npy file
  format: 'jpg' # if save_patches_to_disk = true, then saves patches in this file format
  draw_grid: true # whether to draw the patch grid when visualizing patching results
  grid_thickness: 1 # sets the grid thickness (in px) when visualizing patching results (256: 1, 4096: 2)
  bg_color: # which (r,g,b) values should be used to represent background when visualizing patching results
    - 214
    - 233
    - 238

speed:
  multiprocessing: true
  num_workers: 10 # number of processes to start in parallel

wandb:
  enable: false
  project: 'hs2p'
  exp_name: '${experiment_name}'
  username: 'clemsg'
  dir: '/home/user'
  group:
  tags: []

I assumed that with this setting, the distance between two consecutive patches would always be 512 pixels in all WSIs. However, after checking the patching results, the distance is not consistently 512: within a single WSI, it can be 512, 256, or 64. Could you provide tips on how to ensure that the distance between patches is consistently 512 in all WSIs?

bryanwong17 commented 1 month ago

@clemsgrs Hi, do you have any idea what might be causing the problem? Thanks in advance!

clemsgrs commented 1 month ago

hi, given you set visu: true in your config file, could you show me the results of patching for a slide where you have inconsistent distances between patches?

bryanwong17 commented 1 month ago

Hi @clemsgrs, I attached the visualization for test_003 (Camelyon16) obtained with the configuration above. As can be seen, the x coordinates include 72576 and 72640, which are only 64 apart (consecutive patches should be 512 apart).

![test_003](https://github.com/user-attachments/assets/41af07ca-4ea0-4a05-b45b-bcb4bb248d07)
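
For readers who want to reproduce this kind of check, here is a minimal sketch that lists the distances between consecutive x coordinates. It does not use hs2p; `consecutive_gaps` is a hypothetical helper, and apart from 72576 and 72640 (the values reported above) the coordinates are made up for illustration.

```python
def consecutive_gaps(xs):
    """Return the distances between sorted, de-duplicated coordinates."""
    unique = sorted(set(xs))
    return [b - a for a, b in zip(unique, unique[1:])]


# 72576 and 72640 come from the report above; the other values are illustrative.
x_coords = [71552, 72064, 72576, 72640, 73152]
gaps = consecutive_gaps(x_coords)
print(gaps)  # [512, 512, 64, 512]
```

A gap smaller than the patch size, such as the 64 here, suggests that two neighbouring patches come from different grids.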

clemsgrs commented 1 month ago

That's expected: if you look at the corresponding tissue mask, you will see that multiple tissue blobs were detected. When this happens, each blob gets tiled independently. This can result in either slightly overlapping tiles or tiles that are not exactly 512 pixels apart. See the attached zoom taken from your image.
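
The effect can be illustrated with a small sketch. This is not hs2p's actual code: it assumes, for illustration only, that each blob's grid is anchored at the top-left corner of that blob's bounding box. `tile_bbox` and both bounding boxes are hypothetical.

```python
def tile_bbox(x0, y0, width, height, patch_size=512):
    """Return top-left corners of a non-overlapping grid anchored at (x0, y0)."""
    return [
        (x, y)
        for y in range(y0, y0 + height, patch_size)
        for x in range(x0, x0 + width, patch_size)
    ]


# Two hypothetical blobs whose bounding boxes start at unrelated offsets.
blob_a = tile_bbox(72064, 10000, 1024, 512)
blob_b = tile_bbox(72640, 10200, 1024, 512)

xs_a = sorted({x for x, _ in blob_a})  # [72064, 72576]
xs_b = sorted({x for x, _ in blob_b})  # [72640, 73152]

# Within each blob the step is 512, but across blobs it is arbitrary.
gap = xs_b[0] - xs_a[-1]
print(xs_a, xs_b, gap)  # gap is 64
```

Each grid is regular on its own; the irregular distance only appears where patches from two different blobs sit next to each other.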

bryanwong17 commented 1 month ago

Thanks for pointing out the problem! Do you think it would be fine if I extract it this way?

clemsgrs commented 1 month ago

That's hard to say. In my opinion it's not an issue if it doesn't happen too often. You could try to tune the tissue segmentation parameters to get tissue masks with fewer "detached" blobs. Alternatively, you could provide pre-computed tissue masks.
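
One rough way to judge how often this happens in a slide is to measure the share of patches that do not sit on the most common grid. The sketch below is independent of hs2p and assumes you have the patch top-left coordinates as `(x, y)` tuples; `off_grid_fraction` and the sample coordinates are hypothetical.

```python
from collections import Counter


def off_grid_fraction(coords, patch_size=512):
    """Return the dominant grid offset and the fraction of patches not on it."""
    # Patches on the same grid share the same (x, y) remainders modulo patch_size.
    residues = Counter((x % patch_size, y % patch_size) for x, y in coords)
    dominant, count = residues.most_common(1)[0]
    return dominant, 1 - count / len(coords)


# Illustrative coordinates: three patches on one grid, one on another.
coords = [(72064, 10240), (72576, 10240), (72064, 10752), (72640, 10300)]
dominant, fraction = off_grid_fraction(coords)
print(dominant, fraction)  # (384, 0) 0.25
```

A low fraction would indicate that most patches come from a single grid. Rerunning the check after changing the segmentation parameters shows whether the mask now has fewer detached blobs.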

bryanwong17 commented 1 month ago

Alright, thanks for your opinion! I will close this for now. Thank you for helping!