ornladios / ADIOS2

Next generation of ADIOS developed in the Exascale Computing Program
https://adios2.readthedocs.io/en/latest/index.html
Apache License 2.0

BZip2 compression fails with BZ_OUTBUFF_FULL error on small data sizes #4388

Open dqwu opened 4 hours ago

dqwu commented 4 hours ago

Issue Description

With the ADIOS2 2.10.1 release and BZip2 compression enabled, compressing a very small dataset (here, an array of 9 integers) fails with a runtime error:

[ADIOS2 ERROR] <Helper> <adiosSystem> <ExceptionToError> : adios2_put: [ADIOS2 EXCEPTION] <Operator> <CompressBZIP2> <CheckStatus> : BZ_OUTBUFF_FULL BZIP2 detected size of compressed data is larger than destination length in call to ADIOS2 BZIP2 Compress batch 0

The error is unexpected: one would assume that smaller inputs are easier to compress and would not cause buffer problems. Setting the data length to 10 or higher avoids the error.
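For context, BZ_OUTBUFF_FULL is the status libbz2 returns when the compressed output does not fit in the destination buffer. The bzip2 manual advises sizing that buffer 1% larger than the input plus 600 bytes, because the format's fixed overhead can make tiny inputs grow when compressed. The sketch below only encodes that rule of thumb. Both helper names are hypothetical, and it makes no claim about how ADIOS2 actually sizes its buffer.

```c
#include <stddef.h>

/* Hypothetical helper (not part of ADIOS2 or libbz2): worst-case destination
 * size for a one-shot bzip2 compression of n input bytes, following the
 * bzip2 manual's rule of "1% larger than the input, plus 600 bytes".
 * The 1% term is rounded up to stay on the safe side. */
static size_t bzip2_dest_bound(size_t n)
{
    return n + (n + 99) / 100 + 600;
}

/* Hypothetical check: is a destination buffer of dest_len bytes large
 * enough to be guaranteed to hold the compressed form of n input bytes? */
static int bzip2_dest_is_safe(size_t dest_len, size_t n)
{
    return dest_len >= bzip2_dest_bound(n);
}
```

For the 9 int32 values in this report (36 bytes), the bound is 637 bytes, so a destination buffer of roughly the input size cannot be guaranteed to hold the result.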

Expected Behavior

The library should either handle small data sizes without raising an output-buffer error, or provide a more descriptive error message that states the limitations of BZip2 compression for small datasets.

Proposed Solution

To improve usability, the ADIOS2 library could:

Test Case

The following code reproduces the issue. When DATA_LEN is set to 9, it triggers the error, whereas setting it to 10 works as expected.

#include <adios2_c.h>

#define DATA_LEN 9

int main(int argc, char *argv[])
{
    adios2_adios *adios = adios2_init_serial();
    adios2_io *bpIO = adios2_declare_io(adios, "BP5WriterWithComp");
    adios2_set_engine(bpIO, "BP5");

    adios2_engine *bpWriter = adios2_open(bpIO, "BZip2_compression.bp", adios2_mode_write);

    size_t count = DATA_LEN;
    adios2_variable *var = adios2_define_variable(bpIO, "data_with_comp", adios2_type_int32_t, 1, NULL, NULL, &count, adios2_constant_dims_true);

    adios2_operator *op = adios2_define_operator(adios, "BZip2Lossless", "bzip2");

    size_t operation_index;
    adios2_add_operation(&operation_index, var, op, "", "");

    int data[DATA_LEN] = {0};
    adios2_put(bpWriter, var, data, adios2_mode_sync);

    adios2_close(bpWriter);
    adios2_finalize(adios);

    return 0;
}

dqwu commented 4 hours ago

@pnorbert Could you please take a look at this or assign it to someone familiar with BZip2 in ADIOS? Thanks.