[STORY] Analyse the .svs file using QuPath.

anzhao commented 1 month ago

Metadata

anzhao commented 1 month ago

The patient ID and other information appear on the label image, which QuPath lists among the associated images under the Image tab. We can use an Optical Character Recognition (OCR) library such as Keras-OCR to extract the information we need.

anzhao commented 1 month ago

The complete workflow for automatically extracting the patient ID and other information from the label image (one of the associated images shown under the Image tab) of an .svs file:

1. Extract the Slide Label Image:

import openslide

# Open the .svs file
slide = openslide.OpenSlide('an.svs')

# Extract the slide label image
label_image = slide.associated_images['label']

# Save the label image for the OCR step that follows
label_image.save('patient_id.png')
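Not every .svs file necessarily carries a label image (de-identified slides, for example, may have it removed), in which case the lookup above fails with a bare KeyError. A minimal sketch of a more defensive lookup follows; `get_label_image` is a hypothetical helper name, not part of OpenSlide, and it only assumes that it is given a mapping of names to images such as `slide.associated_images`.

```python
from collections.abc import Mapping
from typing import Any


def get_label_image(associated_images: Mapping[str, Any], key: str = "label") -> Any:
    """Return the associated image stored under `key`, or raise a clear error.

    Hypothetical helper: `associated_images` can be any mapping of names to
    images, for example `slide.associated_images` from OpenSlide.
    """
    if key not in associated_images:
        # List what the file does contain so the caller can pick another image
        available = ", ".join(sorted(associated_images)) or "none"
        raise KeyError(f"no '{key}' associated image; available: {available}")
    return associated_images[key]
```

With the slide opened as above, this would be called as `label_image = get_label_image(slide.associated_images)` before saving.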

2. Perform OCR on the Extracted Image:

#!/usr/bin/env python3
import keras_ocr

# Create the pipeline
pipeline = keras_ocr.pipeline.Pipeline()

# Read the image
images = [keras_ocr.tools.read('patient_id.png')]

# Run text detection and recognition on the images
prediction_groups = pipeline.recognize(images)

# Print the recognized text
for predictions in prediction_groups:
    for text, box in predictions:
        print(text)
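The loop above prints one word per line, and the words may not come back in reading order. A rough sketch of post-processing is shown below: it groups the `(text, box)` pairs into lines using the box coordinates and then searches for an ID. Both function names and the `line_tolerance` parameter are hypothetical, and the pattern passed to `find_first_match` is an assumption, since the real patient ID format on the labels is not stated here.

```python
import re


def words_to_lines(predictions, line_tolerance=0.6):
    """Join (text, box) word predictions into lines of text.

    Each box is four (x, y) corner points. A word joins the current line when
    its vertical centre is within `line_tolerance` times its own height of the
    centre of the first word on that line.
    """
    words = []
    for text, box in predictions:
        xs = [float(p[0]) for p in box]
        ys = [float(p[1]) for p in box]
        words.append((sum(ys) / 4, min(xs), max(ys) - min(ys), text))
    words.sort()  # top to bottom by vertical centre

    lines = []  # each entry: [centre_y of first word, [(left_x, text), ...]]
    for centre_y, left_x, height, text in words:
        if lines and abs(centre_y - lines[-1][0]) <= line_tolerance * height:
            lines[-1][1].append((left_x, text))
        else:
            lines.append([centre_y, [(left_x, text)]])
    # Within each line, order the words left to right
    return [" ".join(t for _, t in sorted(row)) for _, row in lines]


def find_first_match(lines, pattern):
    """Return the first substring matching `pattern` in `lines`, or None."""
    for line in lines:
        match = re.search(pattern, line)
        if match:
            return match.group(0)
    return None
```

For example, `find_first_match(words_to_lines(prediction_groups[0]), r"\d{5}")` would return the first five-digit run on the label, assuming the ID looks like that; the pattern needs adjusting to the actual label layout.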

anzhao commented 1 month ago

Australian-Imaging-Service / pipelines

[STORY] Analyse the .svs file using QuPath. #328

The single slide image produces six DICOM files; when I open them in QuPath, they all display correctly.