amspath / libisyntax

BSD 2-Clause "Simplified" License
12 stars 7 forks source link

Is origin_offset_in_pixels correct? #17

Open jonasteuwen opened 1 year ago

jonasteuwen commented 1 year ago

Following up here since #14 is about something else. Using @Falcury's tool and my function to create a read_region function:

Note: This does not happen in SlideScape! Not even a minor shift. Would this be due to the calculate of the shift in um? I will implement a non-integer shift using bilinear interpolation to check

isyntax_error_t libisyntax_read_region(isyntax_t* isyntax, isyntax_cache_t* isyntax_cache, int32_t level,
                                       int64_t x, int64_t y, int64_t width, int64_t height, uint32_t** out_pixels) {

    // Get the level
    assert(level < &isyntax->images[0].level_count);
    isyntax_level_t* current_level = &isyntax->images[0].levels[level];

    int32_t level_count = isyntax->images[0].level_count;
    // Setup the origin offset
    int32_t offset = current_level->origin_offset_in_pixels;
    int PER_LEVEL_PADDING = 3;

    // int32_t offset = ((PER_LEVEL_PADDING << level_count) - PER_LEVEL_PADDING) >> level;
    x += offset + level;
    y += offset + level;

    // Check bounds
//    assert(x + width - offset <= current_level->width_in_pixels);
//    assert(y + height - offset <= current_level->height_in_pixels);

    int32_t tile_width = isyntax->tile_width;
    int32_t tile_height = isyntax->tile_height;

    int64_t start_tile_x = x / tile_width;
    int64_t end_tile_x = (x + width - 1) / tile_width;
    int64_t start_tile_y = y / tile_height;
    int64_t end_tile_y = (y + height - 1) / tile_height;

    // Allocate memory for region
    *out_pixels = (uint32_t*)malloc(width * height * sizeof(uint32_t));

    // Read tiles and copy the relevant portion of each tile to the region
    for (int64_t tile_y = start_tile_y; tile_y <= end_tile_y; ++tile_y) {
        for (int64_t tile_x = start_tile_x; tile_x <= end_tile_x; ++tile_x) {
            // Calculate the portion of the tile to be copied
            int64_t src_x = (tile_x == start_tile_x) ? x % tile_width : 0;
            int64_t src_y = (tile_y == start_tile_y) ? y % tile_height : 0;
            int64_t dest_x = (tile_x == start_tile_x) ? 0 : (tile_x - start_tile_x) * tile_width - (x % tile_width);
            int64_t dest_y = (tile_y == start_tile_y) ? 0 : (tile_y - start_tile_y) * tile_height -
                                                            (y % tile_height);
            int64_t copy_width = (tile_x == end_tile_x) ? (x + width) % tile_width : tile_width - src_x;
            int64_t copy_height = (tile_y == end_tile_y) ? (y + height) % tile_height : tile_height - src_y;

            uint32_t *pixels = NULL;

            // Check if tile exists, if not, don't use the function to read the tile and immediately return an empty
            // tile.
            int64_t tile_index = tile_y * current_level->width_in_tiles + tile_x;
            bool tile_exists = (isyntax->images[0].levels[level].tiles + tile_index)->exists;
            if (tile_exists) {
                // Read tile
                assert(libisyntax_tile_read(isyntax, isyntax_cache, level, tile_x, tile_y, &pixels) == LIBISYNTAX_OK);
                // Copy the relevant portion of the tile to the region
                for (int64_t i = 0; i < copy_height; ++i) {
                    memcpy((*out_pixels) + (dest_y + i) * width + dest_x,
                           pixels + (src_y + i) * tile_width + src_x,
                           copy_width * sizeof(uint32_t));
                }
            } else {
                // Fill up with transparent white pixels (R=255, G=255, B=255, A=0)
                for (int64_t i = 0; i < copy_height; ++i) {
                    for (int64_t j = 0; j < copy_width; ++j) {
                        (*out_pixels)[(dest_y + i) * width + dest_x + j] = 0x00FFFFFFu;
                    }
                }
            }

            // Free the tile data
            free(pixels);
        }
    }

    return LIBISYNTAX_OK;
}

Using the origin_offset_in_pixels as defined in the header, I get a big shift between level 3 and 4 (and beyond, 0, 1, 2 are fine).

image image

If I use ((PER_LEVEL_PADDING << level_count) - PER_LEVEL_PADDING) >> level it's rather stable, but there are very minor visual shifts. You also have these shifts in level 0, 1, 2.

This is the conversion tool. If --cache-size is too small it tends to segfault, which I couldn't locate yet (I will run Valgrind, but that doesn't work on M1, so needs to wait a bit), the conversion is here: https://github.com/NKI-AI/libisyntax/blob/add-read-region/src/isyntax_to_tiff.c

The absolute offset compared to the Philips SDK is slightly annoying, because we are annotating in a platform that uses the SDK and they output absolute coordinates in level 0.

jonasteuwen commented 1 year ago

I've tried to implement bilinear interpolation, but in SlideScape it's still slowly shifting to the left across levels:

It doesn't jump around at a specific level anymore, so probably that's an implementation thing on my end given it's just a multiplication with 0.25 (or mpp). Could be that I made a bit of a mistake in the interpolation, but at least you know I'm investigating this :-)

SNIP

jonasteuwen commented 1 year ago

Let me give an example also why I think the offset is as in the Philips description (padding_var x 2^number_of_levels) - padding_var, so 1533 for 9 levels.

image

This is level 1, size 4000x4000 (resized to 1000x1000) with no offset: level_1_no_offset You can clearly see the zero-padded edge here.

This is level 1, size 4000x4000 (resized to 1000x1000) with offset 1533/2 = 766.5: level_1_correct_offset

Edge is gone, to get a better idea, here is the same level but with offset 741.5: level_1_smaller_offset

We can do the same in level 0, but it's a bit more tricky since need a large crop, but here it is. No offset, size 8000x8000, resized to 1000x1000: level_0_no_offset

The zero-padding is there. Now, using 1533.0: level_0_correct_offset

Making it slightly smaller, 1500 to see the edge: level_0_smaller_offset You need to zoom in, but you see the edge.

While this crops the zero-padded white space, there is still a shift across levels when converted to tiff #18. Maybe this is a coincidence but I in two cases I see the white space being removed this way.

jonasteuwen commented 1 year ago

This seems to keep it stable across TIFF levels. But I don't know why.

int32_t PER_LEVEL_PADDING = 3;
float offset = (float)((PER_LEVEL_PADDING << isyntax->images[0].level_count) - PER_LEVEL_PADDING) / current_level->downsample_factor;

offset -= 1.5f;

float x_float = (float)x + offset;
float y_float = (float)y + offset;

This does seem correct. If you don’t do the interpolation you will see very minor movement when passing through the levels. We might want to accept that given the interpolation is quite expensive? Might not be the best idea considering cell detection and such

Falcury commented 1 year ago

I'd like to help test this, once I have the TIFF converter utility up and running.

@jonasteuwen I think we reading the Philips specification a bit differently though. I think you're not supposed to use the formula with the total number of levels, just the level you want to calculate the offset for. Have you tried this?

int32_t offset;
if (level == 0) {
    offset = 0;
} else {
    offset = (3 << (level - 1)) - 3;
}

I'm not sure about the absolute offset, I'd have to look into this with some test files. I don't have access to SlideScore so I also can't test that.

jonasteuwen commented 1 year ago

@Falcury if it helps I can give you access to SlideScore, but you might have the SDK somewhere, then you wouldn’t need it?

Indeed, I have tested that, but you keep a rather large zero padded level 0. So how I understand it is that the DWT will erode the image slightly starting at level 0 which keeps eating a bit of the image if you proceed (well half of that). So the formula defines the offset at level 0.

The formula I use has a direct effect on cropping that in the few images I tested. The one you use shifts it slightly across levels, what I have seen, but I’ll create a few example! Is there a way to make set a location and screenshot in SlideScape so I can export them consistently? Perhaps with the console?

The only thing that remains is a fixed offset of 2048 with the three images I tested. I am trying to find something in the header that represents that. It’s worth noting though that the offset effectively crops the border, but the border is also slightly outside of the glass slide, so they might have set it that way.

Falcury commented 1 year ago

This seems to keep it stable across TIFF levels. But I don't know why.

int32_t PER_LEVEL_PADDING = 3;
float offset = (float)((PER_LEVEL_PADDING << isyntax->images[0].level_count) - PER_LEVEL_PADDING) / current_level->downsample_factor;

offset -= 1.5f;

float x_float = (float)x + offset;
float y_float = (float)y + offset;

Interesting, the 1.5 pixel thing reminds be a bit of what I did in Slidescape up until recently: https://github.com/amspath/slidescape/blob/604248f0b1f7e699db94b770c086d4a5d6b6af9b/src/isyntax/isyntax.c#L2579-L2593

jonasteuwen commented 1 year ago

Yes, it’s slightly odd, but likely a mathematical consequence of the wavelet transform used (this is definitely not the case for all wavelet transform bases). The formule I use ends up at around 5 border pixels in the lowest level. Typically you would use periodic extensions, but you don’t have that information available in gigapixels image. Symmetric padding is also an option but I suppose they used zero padding for simplicity. That can cause ringing (Gibbs phenomenon) on the borders but as they are mostly white that’s probably minor.

The 5/3 Le Gall wavelet is asymmetric which can cause a small shift together with the small image sizes (rounding in the LL shapes) can cause this shift. However I don’t think it is trivial to compute, I can ask a wavelet expert.

If you don’t do this interpolation and you annotate a lymphocyte in level 0 it shifts across levels, not really desirable.

Falcury commented 1 year ago
def formula1(level):
    offset = 0
    if level > 0:
        offset = (3 << level) - 3
    return offset

def formula2(level, num_levels):
    offset = ((3 << num_levels) - 3) >> level
    return offset

offsets1 = [0] * 9
offsets2 = [0] * 9
for i in range(9):
    offsets1[i] = formula1(i)
    offsets2[i] = formula2(i, 9)

print(offsets1)
print(offsets2)

[0, 3, 9, 21, 45, 93, 189, 381, 765]
[1533, 766, 383, 191, 95, 47, 23, 11, 5]

Something is off here. I still think my 'reading' of the Philips docs is correct, but there is also something I am missing and getting wrong. In order to get a grip on this I think I have to first reproduce your findings using the TIFF converter, and understand exactly why my solution 'seems' to work in Slidescape.

jonasteuwen commented 1 year ago

Yes indeed the lower values are the zero padding of the image they do to get a proper wavelet decomposition. I also don’t understand why it works you would expect it has to decrease as well… If I used that I lost my lymphocytes

I will test against the Philips SDK and OpenPhi.

Falcury commented 1 year ago

Maybe it's a plus/minus thing, i.e. think of the formula in terms of the encoding process and doing that in reverse, instead of thinking of it in terms of the decoding process (keeping the top of the pyramid fixed) and correcting forward for each step.

jonasteuwen commented 1 year ago

Yes, it's not very clear from the documentation, indeed! From a mathematical point of view, I would start reasoning from the level where the data lives (level 0), that's how I also interpreted the formula, and from the point of view it makes sense to me. But let's look at the differences if the TIFFWriter works.

I could also test your formula, but probably we need a float version of that, as implementing it as it is causes minor shifts across levels. Can you let me know what this line should be: https://github.com/NKI-AI/libisyntax/blob/0a9ea0d6f5fa1f384852eba3a61bd6d4b723b3b9/src/libisyntax.c#L455

@Falcury you mention in #18 that you have a 1-1 correspondence without non-integer shifts. I do want to try that.

jonasteuwen commented 1 year ago

This is with the resampling:

https://user-images.githubusercontent.com/2347927/233985548-3e39e18c-d934-46cb-9438-a573900ebf9a.mov

I can also create one with the offsets you want

Falcury commented 1 year ago

Thanks for researching this so thoroughly! I have to test and try to understand what is going on using your findings, and apply the changes.

I think I may have some explanation why my incorrect formula seemed to work in Slidescape. The increasing offsets from level 0 upwards are applied in the viewer as negative offsets, which I guess works for displaying on a screen but then doesn't work for exporting as TIFF. So, I am guessing it might still be something akin to max_offset_for_level_0 - level_specific_negative_offset(level).

What I don't understand is, why are the offset differences small for level 0-1, and large for e.g. 6-7. And why are we seeing the opposite in the TIFF converted files.

Note that if I use your formula to initialize the offsets for iSyntax files, then the offsets are no longer corrected for in Slidescape.

I don't feel entirely comfortable about the floating point values, I suspect Philips' own interpretation probably also isn't using those. Maybe extracting the offsets from the coordinates of the codeblocks in the header and simply hardcoding them as a lookup could also be an option if we can't figure out the exact formula that they used.

Trying this now:

static void test_offsets(isyntax_t* isyntax, isyntax_image_t* wsi) {
    for (i32 scale = 0; scale < wsi->level_count; ++scale) {
        isyntax_level_t* level = wsi->levels + scale;
        isyntax_tile_t* tile = level->tiles;
        if (tile->exists) {
            isyntax_codeblock_t* codeblock = wsi->codeblocks + tile->codeblock_index;
            // Calculate offset of codeblock from the corner of the grid
            // NOTE: codeblocks with LL coefficients (only for topmost layer) have a different onset
            bool has_ll = codeblock->coefficient == 0;
            i32 offset;
            if (has_ll) {
                // (PER_LEVEL_PADDING << scale) - (PER_LEVEL_PADDING - 1) + (1 >> scale)
                offset = get_first_valid_ll_pixel(codeblock->scale);
            } else {
                // (PER_LEVEL_PADDING << scale) - (PER_LEVEL_PADDING - 1)
                offset = get_first_valid_coef_pixel(codeblock->scale);
            }
            // Correct for global offset from the corner of the slide
            i32 x_adjusted = codeblock->x_coordinate - wsi->offset_x;
            i32 y_adjusted = codeblock->y_coordinate - wsi->offset_y;
            console_print("level %d (tile 0,0): x=%d y=%d, offset=%d\n", scale, x_adjusted, y_adjusted, offset);
        }
    }
}

This matches the way block IDs (essentially tile_x, tile_y) are calculated from codeblock x, y coordinates. This gives the following output on testslide.isyntax:

level 0 (tile 0,0): x=1 y=1, offset=1
level 1 (tile 0,0): x=4 y=4, offset=4
level 2 (tile 0,0): x=10 y=10, offset=10
level 3 (tile 0,0): x=22 y=22, offset=22
level 4 (tile 0,0): x=46 y=46, offset=46
level 5 (tile 0,0): x=94 y=94, offset=94
level 6 (tile 0,0): x=190 y=190, offset=190
level 7 (tile 0,0): x=510 y=510, offset=510

Also take a look at this:

per_level_padding = 3

# also see Philips' extract_block_header.py
# equivalent in isyntax.c: get_first_valid_coef_pixel()
def formula1(level):
    offset = (per_level_padding << level) - (per_level_padding - 1)
    return offset

def formula2(level, maxscale):
    offset = ((per_level_padding << maxscale) - per_level_padding) >> level
    return offset

num_levels = 9
offsets1 = [0] * num_levels
offsets2 = [0] * num_levels
offsets3 = [0] * num_levels
offsets4 = [0] * num_levels
for i in range(num_levels):
    offsets1[i] = formula1(i)
    offsets2[i] = formula2(i, num_levels)
    offsets3[i] = formula1(num_levels - i)
    offsets4[i] = formula2(i, num_levels) - 1.5    

print(offsets1)
print(offsets2)
print(offsets3)
print(offsets4)

Output:

[1, 4, 10, 22, 46, 94, 190, 382, 766]
[1533, 766, 383, 191, 95, 47, 23, 11, 5]
[1534, 766, 382, 190, 94, 46, 22, 10, 4]
[1531.5, 764.5, 381.5, 189.5, 93.5, 45.5, 21.5, 9.5, 3.5]

Which is correct? I should try converting to TIFF files with these offsets but I still cannot seem to get that to work correctly for me.

jonasteuwen commented 1 year ago

Thanks! We should delve deeper into it. Indeed perhaps Philips doesn’t do the conversion with interpolation, but then again they designed the format for ‘web viewing’ (following the title of their paper). So they can apply the float offset easily in the viewer, just like you did.

I do have some TIFFs converted by a converter of Philips, I can check if they remain stable across levels. Unfortunately I don’t have the original isyntax anymore.

One thing we can conclude from my findings is that the initial offset is at least approximately correct, so the offsets above are too small (maybe you need to add the offset you get from (3 << max levels) - 3 to it?

Falcury commented 1 year ago

One thing we can conclude from my findings is that the initial offset is at least approximately correct, so the offsets above are too small (maybe you need to add the offset you get from (3 << max levels) - 3 to it?

Yeah something like that. Or maybe the third option in the python code above.

In Slidescape, the following also seems to work to visually correct the iSyntax files, also seemingly fixing a remaining 0.5 pixel shift between levels 1 and 0:

float origin_offset_in_pixels = ((PER_LEVEL_PADDING << scale) - PER_LEVEL_PADDING) * 0.5f;

This would make the visual shifts for levels 0-7:

[0.0, 1.5, 4.5, 10.5, 22.5, 46.5, 94.5, 190.5, 382.5]

From this, it seems indeed that some interpolation may be needed to have it fully stable without any shifts.

For constructing a TIFF, I guess we can get away with sampling only level 0 and downsampling to reconstruct the rest of the pyramid?

But, if this is indeed what is happening, this leads to some questions on how to proceed for querying tiles in iSyntax files? Do we simply round to whole pixel coordinates, accepting that this leads to a small discrepancy between levels 0 and 1? Or do we always interpolate either level 0 or everything above?

jonasteuwen commented 1 year ago

I think we would need the interpolation otherwise you would not be able to do cell detection at different resolutions.

We can indeed do the downsampling, but then we need to take care of aliasing. What I would be worried about is how the models will be generalized if you use these TIFFs to train models as likely during inference, you will just use libisyntax as-is on a level you require. There might be minor discrepancies between the images, which can have major effects in deep learning algorithms.

jonasteuwen commented 1 year ago

@Falcury I see in your latest SlideScape release you mention 'fixing offsets'. Does it mean you've changed it to the above? I can test it against the SDK.

I remember you mention there might be an offset tag available in tiff?

Falcury commented 1 year ago

@Falcury I see in your latest SlideScape release you mention 'fixing offsets'. Does it mean you've changed it to the above? I can test it against the SDK.

If I remember correctly, I mainly changed the code in Slidescape so that it's in sync with libisyntax, which included our work on identifying the offsets. (Slidescape renders the iSyntax levels at offsetted positions.) I think I put this part on a separate line in the release notes because it's something that is noticable by the user, instead of just refactoring.

I remember you mention there might be an offset tag available in tiff?

I think you're referring to the discussing under https://github.com/amspath/libisyntax/pull/18?

dregula commented 7 months ago

Gentleman- Hello and a very tardy comment on iSyntax tile offset. Philips was using iSyntax wavelet compression in their Radiology products, but likely never anticipated a tiled, multi-layer format, such as WSI.

There is a caveat in their own 2014 IntelliSpace User Guide (attached) regarding high-resolution mammograms:

"After receipt of such images in IntelliSpace PACS, when the iSyntax technology is applied, the wavelet representation of the discontinuity created by the applied pixel padding value can result in the presentation of the white outline when the image is viewed at a zoom value of 50% or less. This does not occur when the image is viewed at a zoom value of greater than 50%. This type of discontinuity can never occur in anatomy. The outline is caused by artificial pixel padding values, and appears outside of the breast tissue alongside the outer edge of the breast. The image quality is not affected by this problem. "

(https://github.com/amspath/libisyntax/files/14971752/IntelliSpace_PACS_Enterprise_with_iSyntax_User_Guide_2014.pdf)

As long as Philips restricted the use of iSyntax to tile-less images, the assumption was that the radiologist would quickly adapt to the surreptitious white padding. Thank you for all your great work on SlideScape and libisyntax!