hnu-digihealth / patho-pix

GNU General Public License v3.0

feat: added jupyter demo #17

Closed muellerdo closed 1 month ago

muellerdo commented 1 month ago

I'm writing a Jupyter notebook for the UKSH Healthcare Hackathon in which we demonstrate patho-pix, our framework for whole slide image preprocessing for pathology. Write an introduction for our demonstration notebook.

ChatGPT:

Introduction to Patho-Pix: Whole Slide Image Preprocessing for Pathology

Welcome to the UKSH Healthcare Hackathon! We are excited to present Patho-Pix, a cutting-edge framework designed to streamline the preprocessing of whole slide images (WSIs) in pathology. This Jupyter notebook will guide you through the key features and functionalities of Patho-Pix, demonstrating its potential to enhance the efficiency and accuracy of digital pathology workflows.

Why Whole Slide Imaging?

Whole slide imaging (WSI) represents a significant advancement in pathology, allowing for the digitization of entire histological slides at high resolution. This technology facilitates remote diagnostics, educational initiatives, and computational pathology research. However, the sheer size and complexity of WSIs pose substantial challenges for data management, processing, and analysis.

The Need for Preprocessing

Effective preprocessing of WSIs is crucial for several reasons:

Quality Enhancement: Enhancing image quality by correcting artifacts, normalizing stains, and balancing colors ensures more reliable analyses.
Data Reduction: Reducing the data size through techniques like tiling and compression allows for more manageable storage and faster processing.
Feature Extraction: Identifying and isolating regions of interest (ROIs) facilitates targeted analyses and reduces computational load.
Standardization: Consistent preprocessing across different slides and studies ensures reproducibility and comparability of results.

Introducing Patho-Pix

Patho-Pix is designed to address these challenges by providing a comprehensive suite of tools for WSI preprocessing. Our framework includes functionalities such as:

Artifact Removal: Automatically detect and correct common artifacts in pathology slides.
Stain Normalization: Standardize staining across slides to minimize variability and enhance visual consistency.
Image Tiling: Divide large WSIs into smaller, more manageable tiles for focused analysis.
ROI Detection: Automatically identify and extract regions of interest for further examination.
Compression and Storage Optimization: Efficiently compress and store WSIs without compromising quality.

Demonstration Overview

In this notebook, we will walk you through the core capabilities of Patho-Pix, showcasing its application to sample WSIs. You will see how our framework can transform raw slide images into preprocessed data ready for analysis. The demonstration will cover the following steps:

Loading and Visualizing WSIs: Importing whole slide images and displaying them for initial inspection.
Artifact Detection and Correction: Applying algorithms to detect and remove artifacts from the images.
Stain Normalization: Standardizing the color profiles of the slides to ensure uniformity.
Tiling and ROI Extraction: Segmenting the WSIs into tiles and extracting regions of interest for detailed analysis.
Compression and Storage: Compressing the processed images for efficient storage and retrieval.

By the end of this demonstration, you will have a clear understanding of how Patho-Pix can be integrated into digital pathology workflows to enhance the preprocessing of whole slide images, ultimately leading to more accurate and efficient diagnostic and research outcomes.

Let's get started!

Parse to markdown.

ChatGPT:

Introduction to Patho-Pix: Whole Slide Image Preprocessing for Pathology

Welcome to the UKSH Healthcare Hackathon! We are excited to present Patho-Pix, a cutting-edge framework designed to streamline the preprocessing of whole slide images (WSIs) in pathology. This Jupyter notebook will guide you through the key features and functionalities of Patho-Pix, demonstrating its potential to enhance the efficiency and accuracy of digital pathology workflows.

Why Whole Slide Imaging?

Whole slide imaging (WSI) represents a significant advancement in pathology, allowing for the digitization of entire histological slides at high resolution. This technology facilitates remote diagnostics, educational initiatives, and computational pathology research. However, the sheer size and complexity of WSIs pose substantial challenges for data management, processing, and analysis.

The Need for Preprocessing

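Effective preprocessing of WSIs is crucial for several reasons:

Quality Enhancement: Enhancing image quality by correcting artifacts, normalizing stains, and balancing colors ensures more reliable analyses.
Data Reduction: Reducing the data size through techniques like tiling and compression allows for more manageable storage and faster processing.
Feature Extraction: Identifying and isolating regions of interest (ROIs) facilitates targeted analyses and reduces computational load.
Standardization: Consistent preprocessing across different slides and studies ensures reproducibility and comparability of results.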

Introducing Patho-Pix

Patho-Pix is designed to address these challenges by providing a comprehensive suite of tools for WSI preprocessing. Our framework includes functionalities such as:

Demonstration Overview

In this notebook, we will walk you through the core capabilities of Patho-Pix, showcasing its application to sample WSIs. You will see how our framework can transform raw slide images into preprocessed data ready for analysis. The demonstration will cover the following steps:

  1. Loading and Visualizing WSIs: Importing whole slide images and displaying them for initial inspection.
  2. Tiling and ROI Extraction: Segmenting the WSIs into tiles and extracting regions of interest based on tissue masks for detailed analysis.
  3. Stain Normalization: Standardizing the color profiles of the slides to ensure uniformity.

By the end of this demonstration, you will have a clear understanding of how Patho-Pix can be integrated into digital pathology workflows to enhance the preprocessing of whole slide images, ultimately leading to more accurate and efficient diagnostic and research outcomes.

Let's get started!

# Download images
import os
import tempfile

import requests

# NOTE: convert_jpeg_to_tiff, load_wsi and load_mask are not defined here;
# they are expected to come from the patho-pix code base.

url_img = (
    "http://glioblastoma.alleninstitute.org/cgi-bin/imageservice?path="
    "/external/gbm/prod0/0534338971/0534338971.aff&mime=1&fileout=100125374_2."
    "jpg&zoom=9&top=20608&left=55168&width=15040&height=18048"
)
url_mask = (
    "http://glioblastoma.alleninstitute.org/cgi-bin/imageservice?path="
    "/external/gbm/prod0/0534338827/0534338827_annotation.aff&mime=1"
    "&fileout=100122048_1.jpg&zoom=9&top=20224&left=57888&width=15040&height=18048"
)

# Create temporary directory for dummy data
tmp_data = tempfile.TemporaryDirectory(prefix="tmp.patho-pix.")

# Download dummy image; fail early instead of continuing without a file
print("Downloading dummy image")
response = requests.get(url_img)
response.raise_for_status()
path_img = os.path.join(tmp_data.name, "image.jpg")
with open(path_img, "wb") as fd:
    fd.write(response.content)

# Download dummy mask
print("Downloading dummy mask")
response = requests.get(url_mask)
response.raise_for_status()
path_mask = os.path.join(tmp_data.name, "mask.jpg")
with open(path_mask, "wb") as fd:
    fd.write(response.content)

# Convert to TIFF and point the paths at the converted files
path_img_tiff = path_img.replace(".jpg", ".tiff")
path_mask_tiff = path_mask.replace(".jpg", ".tiff")
convert_jpeg_to_tiff(path_img, path_img_tiff)
convert_jpeg_to_tiff(path_mask, path_mask_tiff)
path_img, path_mask = path_img_tiff, path_mask_tiff

# Load the WSI and the mask, each with its own temporary tile directory.
# Separate variable names keep both TemporaryDirectory objects alive;
# rebinding a single name would let the first directory be cleaned up.
wsi_tile_dir = tempfile.TemporaryDirectory(prefix="tmp.patho-pix.")
wsi = load_wsi(path_img, wsi_tile_dir.name)

mask_tile_dir = tempfile.TemporaryDirectory(prefix="tmp.patho-pix.")
mask = load_mask(path_mask, mask_tile_dir.name)

# Thumbnail
wsi.thumbnail

mask.thumbnail

# Preview tiling
# (AwesomeTiler is expected to come from the patho-pix code base)
wsi_tiler = AwesomeTiler(
    tile_size=(1024, 1024),
    check_tissue=True,
    tissue_percent=10.0,
    prefix="patho-fix.",
    suffix=".png",
)
wsi_tiler.locate_tiles(wsi)
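
The `check_tissue` and `tissue_percent` settings above suggest that tiles containing too little tissue are skipped. A minimal sketch of that idea, assuming a binary tissue mask (1 = tissue, 0 = background) stored as nested lists. The helpers `tissue_percentage` and `keep_tile` are hypothetical names for illustration and are not part of Patho-Pix or its tiler; whether the real threshold comparison is inclusive is also an assumption.

```python
def tissue_percentage(mask, top, left, tile_h, tile_w):
    """Return the percentage of tissue pixels inside one tile of a binary mask."""
    tile = [row[left:left + tile_w] for row in mask[top:top + tile_h]]
    n_pixels = sum(len(row) for row in tile)
    if n_pixels == 0:
        return 0.0  # tile lies completely outside the mask
    n_tissue = sum(sum(row) for row in tile)
    return 100.0 * n_tissue / n_pixels


def keep_tile(mask, top, left, tile_h, tile_w, min_percent=10.0):
    """Keep a tile only if it reaches the minimum tissue percentage (assumed inclusive)."""
    return tissue_percentage(mask, top, left, tile_h, tile_w) >= min_percent


# Toy 4x4 mask split into four 2x2 tiles (illustrative data only)
toy_mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
kept = [
    (top, left)
    for top in (0, 2)
    for left in (0, 2)
    if keep_tile(toy_mask, top, left, 2, 2)
]
print(kept)
```

In a real pipeline the mask would normally be an image array rather than nested lists; the lists only keep the sketch dependency-free.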

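# Run tiling
# (tile_wsi_mask is expected to come from the patho-pix code base)
import pandas as pd

metadata = tile_wsi_mask(wsi, mask)

# Metadata, or something like that?
df = pd.DataFrame.from_dict(metadata, orient="index", columns=["percentage_tissue"])
print(df)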
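
The metadata table could also be condensed into a few summary numbers for the notebook. A sketch assuming `metadata` maps each tile name to a single tissue percentage, which is what the `from_dict` call above implies. The tile names, values, and the `summarize_tiles` helper are made up for illustration.

```python
from statistics import mean

# Hypothetical metadata: tile file name -> tissue percentage
example_metadata = {
    "tile_0_0.png": 12.5,
    "tile_0_1.png": 80.0,
    "tile_1_0.png": 57.5,
}


def summarize_tiles(metadata, min_percent=50.0):
    """Summarize tile count, mean tissue percentage, and tissue-rich tiles."""
    values = list(metadata.values())
    tissue_rich = sorted(
        name for name, percent in metadata.items() if percent >= min_percent
    )
    return {
        "n_tiles": len(values),
        "mean_percent": mean(values),
        "tissue_rich": tissue_rich,
    }


summary = summarize_tiles(example_metadata)
print(summary)
```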

More ChatGPT output:

1. Loading and Visualizing WSIs

In this chapter, we will cover the process of importing whole slide images (WSIs) into our Patho-Pix framework and displaying them for initial inspection. This step is crucial as it allows us to visually assess the quality and characteristics of the slides before applying any preprocessing techniques. We will demonstrate how to load WSIs from various file formats and explore basic visualization tools to navigate and examine the high-resolution images.
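
Displaying a full-resolution WSI is rarely practical, so inspection usually starts from a downscaled thumbnail. A minimal sketch of the size arithmetic, assuming a 15040 x 18048 pixel image (the region size requested in the download URLs of this PR; the downloaded file may differ). `thumbnail_size` is a hypothetical helper, not a Patho-Pix function.

```python
def thumbnail_size(width, height, max_side=512):
    """Return (width, height) scaled to fit max_side while keeping the aspect ratio."""
    # Never upscale: cap the scale factor at 1.0
    scale = min(max_side / width, max_side / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(thumbnail_size(15040, 18048))
```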

2. Tiling and ROI Extraction

In this chapter, we will delve into the process of segmenting whole slide images (WSIs) into smaller, manageable tiles and extracting regions of interest (ROIs). Tiling is essential for handling the large size of WSIs, making it easier to focus on specific areas for detailed analysis. We will demonstrate how Patho-Pix automates the tiling process and efficiently identifies and extracts ROIs, ensuring that critical pathological features are highlighted for further examination.
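
To make the tiling step concrete, the following sketch enumerates a regular grid of tile boxes. It assumes the 1024 x 1024 tile size used in this PR and a 15040 x 18048 pixel image (the region size in the download URLs). `tile_grid` is a hypothetical helper; whether Patho-Pix clips or discards incomplete edge tiles is not stated here, so clipping is an assumption.

```python
import math


def tile_grid(width, height, tile_size):
    """Yield (left, top, right, bottom) boxes covering the image; edge tiles are clipped."""
    tile_w, tile_h = tile_size
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            yield (left, top, min(left + tile_w, width), min(top + tile_h, height))


boxes = list(tile_grid(15040, 18048, (1024, 1024)))
n_cols = math.ceil(15040 / 1024)
n_rows = math.ceil(18048 / 1024)
print(len(boxes), n_cols, n_rows)
```

Only tiles whose tissue content passes the chosen threshold would then be kept as regions of interest.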

codecov-commenter commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 97.68%. Comparing base (437dcec) to head (c3893c6).

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main      #17   +/-   ##
=======================================
  Coverage   97.68%   97.68%
=======================================
  Files           7        7
  Lines         216      216
=======================================
  Hits          211      211
  Misses          5        5
```

| [Flag](https://app.codecov.io/gh/hnu-digihealth/patho-pix/pull/17/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hnu-digihealth) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/hnu-digihealth/patho-pix/pull/17/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hnu-digihealth) | `97.68% <100.00%> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hnu-digihealth#carryforward-flags-in-the-pull-request-comment) to find out more.
