Fix BinData in Mask in ROI.ome.xml

will-moore commented 3 years ago

As reported at https://forum.image.sc/t/prepping-and-including-roi-masks/48750/8 the Mask in the ROI.ome.xml example doesn't display in iviewer. This is because the BinData is too long (for a 2 x 2 mask), and fails to be converted into an image to view in iviewer.

To create the corrected BinData we used this approach, (based on mask_from_binary_image at https://github.com/ome/omero-rois/blob/4b9a388f6def81fc23689aab803479ba3e7fe28a/src/omero_rois/library.py#L94):

>>> import numpy as np
# start with bits for a 2 x 2 array
>>> bitarray = np.array([0, 1, 1, 0], dtype=np.bool)
>>> mask_packed = np.packbits(bitarray)
>>> mask_packed
array([96], dtype=uint8)
>>> mask_string = np.ndarray.tostring(mask_packed)
>>> mask_string
b'`'
>>> from base64 import b64encode
>>> b64encode(mask_string)
b'YA=='

When imported into OMERO, and viewed in iviewer the fixed <BinData> renders a 2 x 2 mask (although this appears misaligned a little in iviewer, probably because it's such a small image & mask).

imagesc-bot commented 3 years ago

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/prepping-and-including-roi-masks/48750/10

rleigh-codelibre commented 3 years ago

The change here is from

% echo -n "AOnpAA==" | base64 -D | hexdump
0000000 00 e9 e9 00                                    
0000004

To

% echo -n "YA==" | base64 -D | hexdump
0000000 60                                             
0000001

Given that the mask is 2x2 pixels, is this valid? It has set bits 5 and 6, but it's only possible to set bits 0-3? If that is true then you want between 0x00 (AA==) to 0x0F (Dw==)?

will-moore commented 3 years ago

It seems that with the numpy implementation above, when we only need half a byte, the most significant bits of the byte are used. So [0, 1, 1, 0] in the example above is equivalent to 01100000 (96).

This is working with the way it's implemented in webgateway/render_shape_mask/, although I can't tell if it looks the same in Insight (too small to see individual pixels). Maybe someone could check what Insight is doing e.g. with https://merge-ci.openmicroscopy.org/web/webclient/img_detail/184718/

In iviewer this looks like this (NB: some issue with the way the <Transform> is being applied):

Screenshot 2021-02-24 at 13 44 59

sbesson commented 3 years ago

In terms of integration, importing the sample OME-XML with the proposed changes into OMERO now results in the expected behavior with the mask is displayed in yellow with the expected values (0, 1, 1,0)

Screenshot 2021-02-24 at 13 25 49

An interesting UI bug uncovered by this work is that while the mask rendering is correctly translated according the Transform element, the bounding box only seems to be set only by the metadata in the Mask element. This issue only seems to affect masks as the transformed rectangle renders correctly and can be captured separately.

Screenshot 2021-02-24 at 13 26 05

On the BinData element, there is certainly some unclarity on the rules to encode the mask, the most recent example of it being my comment in https://forum.image.sc/t/prepping-and-including-roi-masks/48750/5 suggesting the bit array needs to be base64 encoded. From our investigation with @will-moore, it looks like the bit array needs to be turned a byte array first, optionally compressed e.g. with zlib, then encoded into base64 format.

As the length of bit arrays cannot be expected to be systematically a multiple of 8, this raises the question of the expectations for the extra bits as rightfully raised in https://github.com/ome/ome-model/pull/140#issuecomment-784554233. From an implementation perspective, at least the numpy.packbits behavior is to pad by using zeros at the end. This matches the webgateway decoding implementation in https://github.com/ome/omero-web/blob/6049673b5ee43cb01149456588c1e8ed3d0b7b8c/omeroweb/webgateway/views.py#L802-L808 although arguably since the width and the length of the bit array are given by the Mask element, should the value of the remaining bits be considered as irrelevant as these should be discarded by the consumer code to only keep [0: width*height -1] ?

rleigh-codelibre commented 3 years ago

For a language-agnostic file format specification, I think you need to be careful here. I don't think that you should simply do it "the numpy way" without really thinking through the consequences. OME-TIFF and OME-XML need to be consumed more generally than that. I think you should properly understand how numpy is working here and explore all the edge cases to fully determine how the packing behaves. The described behaviour here is counter to my previous understanding of it when we used it in ome_files_py. It's using standard nD array strides and offsets per dimension in the underlying C. It maps straight on to how the OME-Files PixelBuffer/boost::multi_array represent nD arrays. I really would urge you to properly explore the problem space before committing to this. I am quite sceptical that it's the right decision.

I would encourage you to look how TIFF stores packed bit values in both strips and tiles. It has a written specification for this, plus a reference implementation in libTIFF. It has minimum padding requirements for rows. It's already implemented correctly by OME-Files, and I think for a metadata format whose use case is primarily for OME-TIFF, using the TIFF packing semantics would be the more portable choice here. Not least because that's what most of the TIFF-using programs are already using themselves. I think @sbesson's comments above regarding the zero padding are spot on, and match my findings with TIFF detailed in the comment below.

Regards, Roger

rleigh-codelibre commented 3 years ago

I created a selection of 2x2 TIFF images with 1 sample per pixel: bit-tiff.zip. Run tiffinfo to inspect them. All bits are set to 1 in the 2x2 image area.

Example:

% tiffinfo -s data-layout-2x2-planar-tiles-16x16.tiff
TIFF Directory at offset 0x28 (40)
  Image Width: 2 Image Length: 2
  Tile Width: 16 Tile Length: 16
  Bits/Sample: 1
  Sample Format: unsigned integer
  Compression Scheme: None
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Planar Configuration: single image plane
  Page Number: 0-1
  DocumentName: /Users/rleigh/code/ome-files/cmake-build-debug/ome-files/test/data/data-layout-2x2-planar-tiles-16x16.tiff
  Software: GraphicsMagick 1.3.36 20201226 Q16 http://www.GraphicsMagick.org/
  1 Tiles:
      0: [       8,       32]

Note that the pixel data starts at offset 8 and is 32 bytes. The tile size is 16x16 and the storage size is 16x16/8 = 32. Look at the raw data:

% hexdump data-layout-2x2-planar-tiles-16x16.tiff | head -n4
0000000 49 49 2a 00 28 00 00 00 c0 bf c0 bf bf bf bf bf
0000010 bf bf bf bf bf bf bf bf bf bf bf bf bf bf bf bf
0000020 bf bf bf bf bf bf bf bf 0f 00 00 01 03 00 01 00
0000030 00 00 02 00 00 00 01 01 03 00 01 00 00 00 02 00

Parts of the tile which aren't in the image area are padded with bf. But the bytes which are are c0. So, that's 11000000. The mask is MSB-first. But it's padded out to the end of the byte. TIFF tiles are a minimum of 16x16 even though the image size is 2x2. Each row is two bytes.

Looking at strips instead of tiles. Starting at one strip per row.

% tiffinfo -s data-layout-2x2-planar-strips-01.tiff
TIFF Directory at offset 0xa (10)
  Image Width: 2 Image Length: 2
  Bits/Sample: 1
  Sample Format: unsigned integer
  Compression Scheme: None
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Rows/Strip: 1
  Planar Configuration: single image plane
  Page Number: 0-1
  DocumentName: /Users/rleigh/code/ome-files/cmake-build-debug/ome-files/test/data/data-layout-2x2-planar-strips-01.tiff
  Software: GraphicsMagick 1.3.36 20201226 Q16 http://www.GraphicsMagick.org/
  2 Strips:
      0: [       8,        1]
      1: [       9,        1]

Two strips at bytes 8 and 9, one byte per row, one byte per strip.

% hexdump data-layout-2x2-planar-strips-01.tiff | head -n1
0000000 49 49 2a 00 0a 00 00 00 c0 c0 0e 00 00 01 03 00

You can see that bytes 8 and 9 are c0. So again, we're padded out to the end of the byte for each row.

Now lets look at a stripsize of 2:

% tiffinfo -s data-layout-2x2-planar-strips-02.tiff
TIFF Directory at offset 0xa (10)
  Image Width: 2 Image Length: 2
  Bits/Sample: 1
  Sample Format: unsigned integer
  Compression Scheme: None
  Photometric Interpretation: min-is-black
  Samples/Pixel: 1
  Rows/Strip: 2
  Planar Configuration: single image plane
  Page Number: 0-1
  DocumentName: /Users/rleigh/code/ome-files/cmake-build-debug/ome-files/test/data/data-layout-2x2-planar-strips-02.tiff
  Software: GraphicsMagick 1.3.36 20201226 Q16 http://www.GraphicsMagick.org/
  1 Strips:
      0: [       8,        2]

One byte per row; two bytes per strip.

% hexdump data-layout-2x2-planar-strips-02.tiff | head -n1
0000000 49 49 2a 00 0a 00 00 00 c0 c0 0e 00 00 01 03 00

Again, it's c0, with each row padded out to the end of the byte. Same with large strip sizes.

Please feel free to examine the other TIFF files in more detail.

To summarise: TIFF stores packed bits (unsigned integer, SPP=1) as MSB-first tightly-packed bitfields which are zero-padded to the nearest byte on row boundaries. This applies to both strips and tiles. This should be considered when specifying the behaviour of Mask BinData.

Given the historically poor lack of specificity surrounding BinData, I would strongly encourage you to consider supplementing BinData in Mask with TiffData allowing for efficient and compact storage of masks as actual images in a format designed for the purpose. Deprecating BinData and using TiffData would be a far better solution to the problem. Not only is it more efficient to store both for the mask data itself and the reduced bloat in the XML, it also means all of the same pixeldata buffer formats and processing code can be shared between images and masks. Which, given that they are both images, would make a great deal more sense. It also means that it can be efficiently compressed, broken up into strips and tiles, and potentially use more than one sample per pixel. Masks are images, and the OME-XML model should really be treating them as such. The Base64 BinData is a legacy which we should have replaced long ago, back when we did the same for Pixels.

imagesc-bot commented 3 years ago

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/prepping-and-including-roi-masks/48750/16

will-moore commented 3 years ago

Hi Roger, Thanks for taking the time to provide all the feedback. Seems to make a lot of sense - just need to get my head around it etc. I think the team will discuss a bit more widely and decide various steps we need to take etc, including improved documentation as mentioned on the forum discussion. Cheers, Will.

blasky commented 3 years ago

Given the historically poor lack of specificity surrounding BinData, I would strongly encourage you to consider supplementing BinData in Mask with TiffData allowing for efficient and compact storage of masks as actual images in a format designed for the purpose.

For what it's worth as a novice consumer of the Xsd and OME tech, this was my initial assumption. I figured our internal Png mask files would need to be converted to Tiff, Base64 encoded, and embedded in the Xml. It wasn't until the discussion on Image.sc that I realized that the data structure for the Mask BinData was quite different and specialized.

will-moore commented 3 years ago

As discussed by OME team, closing this PR (discussion to be moved to a new issue). FYI: OME is working on a "next-generation" file format based on zarr, where masks are zarr arrays and referred to as "labels" (see https://ngff.openmicroscopy.org/latest/).

sbesson commented 3 years ago

A couple of comments in addition to https://github.com/ome/ome-model/pull/140#issuecomment-788018907

the BinData specification was established in the early 2000 and discussed in https://genomebiology.biomedcentral.com/articles/10.1186/gb-2005-6-5-r47
it largely predates the appearance of OME-TIFF roughly around 2007
for the Image section of the OME model, the BinData specification should largely be regarded as legacy and has been superseded by the introduction of the TiffData element
the general consensus while reviewing this with the @ome/formats team is that no-one feels authoritative about the correct specification for Mask.BinData
OME has built some implementations focus on importing an OME-XML Mask into OMERO however other implementation might have made different assumptions
some flexible able to associate a mask/label to a file (via annotations) would be valuable
dropping Mask.BinData in favor or Mask.TiffData had been mentioned during the previous model effrots
as a general rule, modifications to the OME model are very costly multi-month efforts that typically result in all (code-generated) API being broken across the board. This is one of the reason why no such change has happened since 2016.

Within the scope of this discussion, the proposed next steps are:

close this Pull Request as the change cannot be approved as the authoritative way to encode the binary mask data as it stands - https://github.com/ome/ome-model/pull/140#issuecomment-788018907
move the specification discussion that was initiated in this PR into an ome-model issue, try to collect more use cases and implementations and reach a community agreement before making any formal decision
as per the ambiguity, open a follow-up PR to comment out the Mask element from the official ROI.ome.xml sample and link to the corresponding issue (or remove it completely)

imagesc-bot commented 2 years ago

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/prepping-and-including-roi-masks/48750/22

ome / ome-model

Fix BinData in Mask in ROI.ome.xml #140