imi-bigpicture / wsidicom

Python package for reading DICOM WSI file sets.
Apache License 2.0

Created instance datasets for levels with index > 0 have `ImageType==["ORIGINAL", "PRIMARY", "VOLUME", "RESAMPLED"]` #175

Open jleuschn opened 19 hours ago

jleuschn commented 19 hours ago

When creating level instances with wsidicom.instance.dataset.WsiDataset.create_instance_dataset, the resulting dataset always has ImageType[0]=="ORIGINAL", with ImageType[3]=="NONE" for level 0 and ImageType[3]=="RESAMPLED" for all other levels.

If I understand the description of Image Type in the DICOM standard correctly, resampling is considered a derivation, i.e., if ImageType[3]=="RESAMPLED", it should imply ImageType[0]=="DERIVED".

Indeed, openslide does not allow the combination "ORIGINAL" + "RESAMPLED", so levels with index > 0 of a DICOM slide created this way are not detected, see the image types allowed by openslide:

// the ImageTypes we allow for pyr levels
static const char *const ORIGINAL_TYPES[] = {
  "ORIGINAL", "PRIMARY", "VOLUME", "NONE", NULL
};
// if the image has been re-encoded during conversion to DICOM
static const char *const DERIVED_ORIGINAL_TYPES[] = {
  "DERIVED", "PRIMARY", "VOLUME", "NONE", NULL
};
static const char *const RESAMPLED_TYPES[] = {
  "DERIVED", "PRIMARY", "VOLUME", "RESAMPLED", NULL
};
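To make the effect of these lists concrete, here is a minimal Python sketch that mirrors them. It assumes openslide compares the ImageType values element by element against each list; the names `ALLOWED_IMAGE_TYPES` and `classify_image_type` are hypothetical and not part of openslide or wsidicom.

```python
from typing import Optional, Sequence

# Mirrors the three ImageType lists quoted from openslide above.
ALLOWED_IMAGE_TYPES = {
    "original": ("ORIGINAL", "PRIMARY", "VOLUME", "NONE"),
    "derived_original": ("DERIVED", "PRIMARY", "VOLUME", "NONE"),
    "resampled": ("DERIVED", "PRIMARY", "VOLUME", "RESAMPLED"),
}


def classify_image_type(image_type: Sequence[str]) -> Optional[str]:
    """Return the name of the matching list, or None if no list matches.

    Assumption: a match requires all four values to be equal, in order.
    """
    values = tuple(image_type)
    for name, allowed in ALLOWED_IMAGE_TYPES.items():
        if values == allowed:
            return name
    return None


# The combination written for levels with index > 0 matches none of the lists.
print(classify_image_type(["ORIGINAL", "PRIMARY", "VOLUME", "RESAMPLED"]))  # None
print(classify_image_type(["DERIVED", "PRIMARY", "VOLUME", "RESAMPLED"]))  # resampled
```

Under that assumption, only level 0 (with "NONE" as the fourth value) would be accepted as a pyramid level.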

Here is an example using wsidicomizer showing that these level files are not found by openslide:

import os
from openslide import OpenSlide
from pydicom import dcmread
from wsidicom import WsiDicom
from wsidicomizer import WsiDicomizer

# Prerequisite: download the CMU-1.tiff test image and save it at `TIFF_PATH`:
# https://openslide.cs.cmu.edu/download/openslide-testdata/Generic-TIFF/CMU-1.tiff
TIFF_PATH = "CMU-1.tiff"
PATH = "CMU-1_converted"

with WsiDicomizer.open(TIFF_PATH) as wsi:
    wsi.save(PATH)

print("Image Type fields:")
print(
    "\n".join(
        f"{dcmread(os.path.join(PATH, f)).ImageType} (file {f})"
        for f in os.listdir(PATH)
        if f.endswith(".dcm")
    )
)

slide_openslide = OpenSlide(
    next(os.path.join(PATH, f) for f in os.listdir(PATH) if f.endswith(".dcm"))
)
print("Level dimensions according to openslide:", slide_openslide.level_dimensions)

# Use a context manager so the DICOM files are closed afterwards.
with WsiDicom.open(PATH) as slide:
    print(
        "Level dimensions according to wsidicom:",
        tuple((lvl.size.width, lvl.size.height) for lvl in slide.levels),
    )

If this issue is better suited at the wsidicomizer repo, I'm happy to move it, but I think create_instance_dataset should not use ImageType[0]=="ORIGINAL" together with ImageType[3]=="RESAMPLED", which would need to be changed here in wsidicom.
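As a rough illustration of the rule I have in mind, the sketch below forces the first value to "DERIVED" whenever the fourth value is "RESAMPLED". The helper `consistent_image_type` is hypothetical; it is not the existing create_instance_dataset code, only a sketch of the invariant it could enforce.

```python
from typing import List, Sequence


def consistent_image_type(image_type: Sequence[str]) -> List[str]:
    """Return a copy of image_type in which RESAMPLED implies DERIVED.

    Hypothetical helper: if the fourth value is "RESAMPLED", the first
    value is set to "DERIVED"; all other values are left unchanged.
    """
    values = list(image_type)
    if len(values) >= 4 and values[3] == "RESAMPLED":
        values[0] = "DERIVED"
    return values


# Level 0 stays as it is; resampled levels become DERIVED.
print(consistent_image_type(["ORIGINAL", "PRIMARY", "VOLUME", "NONE"]))
print(consistent_image_type(["ORIGINAL", "PRIMARY", "VOLUME", "RESAMPLED"]))
```

The input is copied rather than modified in place, so the source dataset's ImageType is not changed as a side effect.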

erikogabrielsson commented 13 hours ago

Hi @jleuschn

Yes, it might be more reasonable to use DERIVED for RESAMPLED levels. Can you make a PR?

However, I have seen images (not produced by wsidicom/izer) that have ['ORIGINAL', 'PRIMARY', 'VOLUME', 'RESAMPLED']; see, for example, MoticWangjie^Professer in the test data. I have opened an issue at openslide asking why they have this check.