imi-bigpicture / wsidicomizer

Python library for converting WSI files to DICOM
Apache License 2.0

Connection between tile_size, NumberOfFrames, Rows/Columns and TotalPixelMatrixRows/TotalPixelMatrixColumns #82

Closed fhnaumann closed 1 year ago

fhnaumann commented 1 year ago

I do not understand how the DICOM tags named in the title (NumberOfFrames, Rows/Columns and TotalPixelMatrixRows/TotalPixelMatrixColumns) are set. I would expect every tile from every level to have its own frame, but from my understanding that does not seem to be the case.

An example: I'm using CMU-1-JP2K-33005.svs as input with a tile_size of 512. The Rows/Columns value is set to 240/240, which is probably caused by scaling. So far so good. The TotalPixelMatrixRows/TotalPixelMatrixColumns obviously vary between the levels/SOPInstances. For the first (most detailed) level they are 46000/32893. From my understanding the total should just be NumberOfFrames * Rows/Columns (no optical paths or focal planes present in the file), which for the first level with 26496 frames would equate to 6359040, but the value is set to much less.

Another point is NumberOfFrames itself. Following the example above, it's set to 26496, but if I use pydicom's generate_pixel_data_frame(dataset.PixelData) I get 13248, exactly half as many. This may just be a bug in the pydicom method, though, because for levels with only 1 frame the method says there are 0 frames. Using DICOMweb's frames option works fine for every frame up to 26496.

erikogabrielsson commented 1 year ago

Hi @wand555

For svs the tile size argument is ignored and the tile size stored in the image is used instead, in order to provide (simpler) lossless conversion.

NumberOfFrames (assuming fully tiled) should be equal to the number of tiles in a row * number of tiles in a column:

import math

# `dataset` is the pydicom Dataset of one pyramid level.
number_of_frames = dataset.NumberOfFrames
image_size_rows = dataset.TotalPixelMatrixRows
image_size_columns = dataset.TotalPixelMatrixColumns
tile_size_rows = dataset.Rows
tile_size_columns = dataset.Columns
# Round up: tiles on the bottom and right edges may be only partly covered by the image.
number_of_tiles_row = math.ceil(image_size_rows / tile_size_rows)
number_of_tiles_columns = math.ceil(image_size_columns / tile_size_columns)

print(number_of_frames)
print(number_of_tiles_columns*number_of_tiles_row)
26496
26496
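
As a worked check of the relation above, here is a minimal sketch that uses only the numbers quoted in this thread (46000/32893 for the total pixel matrix, 240 for Rows/Columns) instead of reading a file. The variable names are illustrative and are not part of wsidicomizer or pydicom. It also shows why NumberOfFrames * Rows is not the image height: the frames form a two-dimensional grid, and the edge tiles extend slightly past the image.

```python
import math

# Values quoted in this thread for the most detailed level of the .svs file.
total_pixel_matrix_rows = 46000
total_pixel_matrix_columns = 32893
rows = columns = 240  # size of one frame (tile)

# Number of tiles needed to cover the image in each direction, rounded up.
tiles_down = math.ceil(total_pixel_matrix_rows / rows)
tiles_across = math.ceil(total_pixel_matrix_columns / columns)
number_of_frames = tiles_down * tiles_across

# Area covered by the full tile grid; it is at least as large as the image.
padded_rows = tiles_down * rows
padded_columns = tiles_across * columns

print(tiles_down, tiles_across, number_of_frames)  # 192 138 26496
print(padded_rows, padded_columns)  # 46080 33120
```

The grid is 192 x 138 tiles, which gives the 26496 frames reported in NumberOfFrames, while the total pixel matrix stays at 46000/32893.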

For me pydicom's generate_pixel_data_frame results in the expected number of frames:

from pydicom import dcmread
from pydicom.encaps import generate_pixel_data_frame
with dcmread(output + r'\1.2.826.0.1.3680043.8.498.11757449005436707302747969083076759976.dcm') as dataset:
    frame_count = len(list(generate_pixel_data_frame(dataset.PixelData)))
    print(frame_count)
26496

fhnaumann commented 1 year ago

Hey (renamed account, previously named wand555), thanks for your response (and sorry for the late answer).

I had an error in my testing code with pydicom's methods leading to the frames being consumed twice, hence I always got half the actual frame number.
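
For anyone hitting the same symptom, here is a hypothetical reproduction of how a generator that is consumed twice can report half the frames. This is not my original test code, and fake_frames is an invented stand-in for a frame generator such as the one pydicom returns. A generator can only be iterated once, so pulling from the same object in two places splits the items between the two consumers.

```python
def fake_frames(n):
    """Hypothetical stand-in for a frame generator: yields n dummy frames."""
    for i in range(n):
        yield bytes([i % 256])


def count_frames(frames):
    """Count frames by consuming the generator exactly once."""
    return sum(1 for _ in frames)


def count_frames_consumed_twice(frames):
    """Buggy count: zip pulls two items from the same generator per step."""
    return sum(1 for _ in zip(frames, frames))


print(count_frames(fake_frames(26496)))                 # 26496
print(count_frames_consumed_twice(fake_frames(26496)))  # 13248
print(count_frames_consumed_twice(fake_frames(1)))      # 0
```

In this sketch the buggy variant yields 13248 for 26496 frames and 0 for a single frame, which matches the numbers I reported above.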

Reading through OpenSlide's documentation of the different formats again, I also found the parts stating the fixed image sizes etc. that you mentioned.