areaDetector / ADEiger

areaDetector driver for the Dectris Eiger detector
https://areadetector.github.io/areaDetector/ADEiger/eiger.html

Crashing when trying to decompress bitshuffle/lz4 data on Stream interface #25

Closed MarkRivers closed 5 years ago

MarkRivers commented 5 years ago

Currently the Stream interface in ADEiger can only handle LZ4 compression. I am trying to add support for Bitshuffle/LZ4 compression.

I am calling the following bitshuffle function from bitshuffle 0.3.5:

bshuf_decompress_lz4(const void* in, void* out, const size_t size, const size_t elem_size, size_t block_size);

This is the function in ADEiger that does the decompression. The printf calls are for debugging.

int StreamAPI::uncompress (stream_frame_t *frame, char *dest)
{
    const char *functionName = "uncompress";

    printf("StreamAPI::uncompress, frame->encoding=%s, frame->compressedSize=%d, frame->uncompressedSize=%d\n",
           frame->encoding, (int)frame->compressedSize, (int)frame->uncompressedSize);
    if (strcmp(frame->encoding, "lz4<") == 0) {
        printf("Calling bshuf_decompress_lz4, uncompressedSize=%d\n",
               (int)frame->uncompressedSize);
        int result = LZ4_decompress_fast((const char *)frame->data, dest, (int)frame->uncompressedSize);
        if (result < 0)
        {
            ERR_ARGS("LZ4_decompress failed, result=%d\n", result);
            return STREAM_ERROR; 
        }
    } 
    else if (strcmp(frame->encoding, "bs32-lz4<") == 0) {
        size_t elemSize = 4;
        if (frame->type == stream_frame_t::UINT16) elemSize = 2;
        size_t numElements = frame->uncompressedSize/elemSize;
        printf("Calling bshuf_decompress_lz4, nmElements=%d, elemSize=%d, blockSize=%d\n",
               (int)numElements, (int)elemSize, 0);
        int result = bshuf_decompress_lz4((const char *)frame->data, dest, numElements, elemSize, 0);
        if (result < 0)
        {
            ERR_ARGS("bshuf_decompress_lz4 failed, result=%d", result);
            return STREAM_ERROR;
        }
    } 

    return STREAM_SUCCESS;
}

The LZ4 decompression is working fine. The problem is that the bitshuffle decompression is crashing with an access fault.

This is the output when the compression is LZ4. It decompresses fine.

StreamAPI::uncompress, frame->encoding=lz4<, frame->compressedSize=77242, frame->uncompressedSize=2117680
Calling bshuf_decompress_lz4, uncompressedSize=2117680

This is the output when the compression is BS/LZ4. It prints this message and then crashes in the call to bshuf_decompress_lz4().

StreamAPI::uncompress, frame->encoding=bs32-lz4<, frame->compressedSize=43130, frame->uncompressedSize=2117680
Calling bshuf_decompress_lz4, nmElements=529420, elemSize=4, blockSize=0
Segmentation fault

I am using the same call to bshuf_decompress_lz4 elsewhere in areaDetector and it is working.

Does anyone have example code on how to decompress the Eiger stream data when it is compressed with bs32-lz4?

MarkRivers commented 5 years ago

I stumbled across this Python code to decompress Eiger bs32-lz4 compressed buffers here: https://github.com/cctbx/cctbx-playground/blob/master/dxtbx/format/FormatEigerStream.py

blob = np.fromstring(data[12:], dtype=np.uint8)
# blocksize is big endian uint32 starting at byte 8, divided by element size
blocksize = np.ndarray(shape=(), dtype=">u4", buffer=data[8:12])/4
imgData = bitshuffle.decompress_lz4(blob, shape[::-1], np.dtype(dtype), blocksize)
return imgData

This suggests that the actual compressed data starts at byte 12 of the buffer, unlike lz4, where it starts at byte 0. The Python code also decodes the block size from bytes 8 to 11 and passes it to the decompressor.

I tried changing my C code to use offset 12 in the buffer, but left blocksize=0. That seems to work!
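
The change described above can be sketched roughly as follows. This is only an illustration, not the code that went into ADEiger: the helper `bslz4_prepare`, the type `bslz4_args_t`, and the constant `BSLZ4_HEADER_SIZE` are hypothetical names, and the 12-byte offset is taken from the Python snippet rather than from any Dectris documentation.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption from the Python snippet: 12 bytes precede the bitshuffle/LZ4 stream. */
#define BSLZ4_HEADER_SIZE 12

/* Hypothetical helper type: the values to pass on to bshuf_decompress_lz4(). */
typedef struct {
    const char *payload;   /* first byte of the actual compressed stream */
    size_t payloadSize;    /* compressed bytes remaining after the header */
    size_t numElements;    /* the "size" argument of bshuf_decompress_lz4 */
} bslz4_args_t;

/* Returns 0 on success, -1 if the blob is too short or the sizes are inconsistent. */
static int bslz4_prepare(const char *data, size_t compressedSize,
                         size_t uncompressedSize, size_t elemSize,
                         bslz4_args_t *args)
{
    assert(data != NULL && args != NULL);
    /* A blob that holds only the header (or less) has nothing to decompress. */
    if (compressedSize <= BSLZ4_HEADER_SIZE) return -1;
    /* The image must be a whole number of elements. */
    if (elemSize == 0 || uncompressedSize % elemSize != 0) return -1;

    args->payload = data + BSLZ4_HEADER_SIZE;
    args->payloadSize = compressedSize - BSLZ4_HEADER_SIZE;
    args->numElements = uncompressedSize / elemSize;
    return 0;
}
```

With this, the call in uncompress() would become something like bshuf_decompress_lz4(args.payload, dest, args.numElements, elemSize, 0), with the block size still left at 0. The length check guards against reading past the end of a truncated message.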

Questions:

MarkRivers commented 5 years ago

I have now studied the Eiger SIMPLON API Reference 1.6.x, Document Version: 4.

This is what it says about the Stream data on page 28.


6.4.2.2. Image Data

Zeromq multipart message consisting of the following parts:

Note that it says (correctly, I believe) that “lz4 data is written as defined at https://code.google.com/p/lz4/ without any additional data like block size etc.”.

However, it does not say anything about how bitshuffle/lz4 data is written. I have found that the compressed data actually begins at byte 12 in the data blob. But I would like to know exactly what is in the first 12 bytes as well.
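
Until the format is documented, one way to inspect those 12 bytes is sketched below. The Python snippet quoted earlier in this thread reads bytes 8 to 11 as a big-endian uint32 block size in bytes. Reading bytes 0 to 7 as a big-endian uint64 uncompressed size is only an assumption, made by analogy with the header that the bitshuffle HDF5 filter writes; it is not confirmed for the Eiger stream. The names `bslz4_header_t`, `read_be`, and `bslz4_read_header` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the 12 bytes that precede the compressed stream. */
typedef struct {
    uint64_t uncompressedBytes; /* bytes 0-7: ASSUMED to be the uncompressed size */
    uint32_t blockBytes;        /* bytes 8-11: block size in bytes (per the Python code) */
} bslz4_header_t;

/* Read an n-byte big-endian unsigned integer (n <= 8), independent of host byte order. */
static uint64_t read_be(const unsigned char *p, size_t n)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++) v = (v << 8) | p[i];
    return v;
}

/* Decode the 12-byte prefix of a bs32-lz4 data blob into hdr. */
static void bslz4_read_header(const void *data, bslz4_header_t *hdr)
{
    const unsigned char *p = (const unsigned char *)data;
    assert(p != NULL && hdr != NULL);
    hdr->uncompressedBytes = read_be(p, 8);
    hdr->blockBytes = (uint32_t)read_be(p + 8, 4);
}
```

Printing hdr.uncompressedBytes next to frame->uncompressedSize for a few frames would show whether the assumption about bytes 0 to 7 holds. Dividing hdr.blockBytes by the element size gives the block size in elements, which is what the Python code passes to bitshuffle instead of 0.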

Why is the bitshuffle/lz4 data format not described in this document?

MarkRivers commented 5 years ago

Since I have solved this problem, I am closing the issue. I am still waiting for Dectris to provide documentation on the data blob when compression is bitshuffle/lz4.