powturbo / TurboBench

Compression Benchmark

zstd and brotli both produce incorrect output for big file #40

Closed · safinaskar closed 1 year ago

safinaskar commented 1 year ago

Hi. zstd and brotli both produce incorrect output for the file https://cdimage.debian.org/debian-cd/current/amd64/iso-dvd/debian-12.0.0-amd64-DVD-1.iso (3.7 G). Here is the output:

# ./turbobench -D -B4G -ebrotli,1/zstd,1 /tmp/debian-12.0.0-amd64-DVD-1.iso 
           0     0.0      -0.00   -1023.18   brotli 1         debian-12.0.0-amd64-DVD-1.iso
ERROR at 868478448:3f, 66 file=debian-12.0.0-amd64-DVD-1.iso
  3677323496    93.5    -240.61    -689.08   zstd 1           debian-12.0.0-amd64-DVD-1.iso
ERROR at 869007360:34, cb file=debian-12.0.0-amd64-DVD-1.iso
TurboBench:  - Sun Jun 11 10:13:25 2023

      C Size  ratio%     C MB/s     D MB/s   SCORE      Name            File
           0     0.0       0.00   11053.89   167423219872854250310480057943336997924438274955232624522333477440436613541123027690304032184897808517832736441413296818951444073477260938629943455293343883926095357701695786860068785110851124808262911167419640293628520552910140667265102788183953006076438190605626183719130397942773345747264478380032.00   ?brotli 1       debian-12.0.0-amd64-DVD-1.iso
  3677323496    93.5    2599.46    7444.48    3679.89   ?zstd 1         debian-12.0.0-amd64-DVD-1.iso

The test was performed in a Docker container running Debian sid, using gcc 12.2.0 from Debian and TurboBench commit 1d17683531537ca953ae7200fb36198ad5ab2ae6.

Here is a full reproducer for Docker:

FROM debian:sid-20230522
ENV LC_ALL C.UTF-8
RUN apt-get update && apt-get install -y apt-utils whiptail
RUN apt-get update && apt-get install -y git make gcc g++ wget ca-certificates
RUN git clone --recurse https://github.com/powturbo/TurboBench
WORKDIR TurboBench
RUN git checkout 1d17683531537ca953ae7200fb36198ad5ab2ae6
RUN make
RUN wget 'https://cdimage.debian.org/debian-cd/current/amd64/iso-dvd/debian-12.0.0-amd64-DVD-1.iso'
RUN ./turbobench -D -B4G -ebrotli,1/zstd,1 debian-12.0.0-amd64-DVD-1.iso
powturbo commented 1 year ago

TurboBench is an in-memory benchmark program and tries to load the whole file into RAM before starting the benchmark. zstd and brotli can't process such a huge block. I suggest benchmarking only the first 1GB block by using the "-B" option:

turbobench file -B1G

safinaskar commented 1 year ago

So there is a bug in zstd and brotli? That is unlikely. I'm nearly sure that zstd and brotli at the very least try to signal an error somehow instead of producing incorrect data.

safinaskar commented 1 year ago

I just tested lzbench with zstd and brotli on the same file. There are no bugs, so there is no problem on zstd's or brotli's side. Of course, if you don't want to support such big files, that is okay. Just please provide a better error message than "ERROR at 868478448:3f"; an error like that looks as if there is a problem with the compressor itself.

(I use x86_64)

powturbo commented 1 year ago

Well, TurboBench doesn't know the maximum allowed block size of the codecs, and it doesn't check every return code from the codecs. It checks the buffer after decompression against the input buffer. As you can see in your output, it reports the errors. As stated before, use a 1GB block or split your file for benchmarking.
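
For illustration, a minimal sketch of such a return-code check, using zstd's public one-shot API (the compress_checked wrapper is illustrative, not actual TurboBench code):

#include <stdio.h>
#include <zstd.h>

/* Compress and report codec-level failures instead of relying only on a
   post-decompression buffer comparison. */
static size_t compress_checked(void *dst, size_t dstcap,
                               const void *src, size_t srcsize, int level) {
    size_t r = ZSTD_compress(dst, dstcap, src, srcsize, level);
    if (ZSTD_isError(r)) {                /* the return value encodes errors */
        fprintf(stderr, "zstd: %s\n", ZSTD_getErrorName(r));
        return 0;
    }
    return r;                             /* compressed size in bytes */
}

brotli's one-shot BrotliEncoderCompress similarly reports failure through its BROTLI_BOOL return value.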

safinaskar commented 1 year ago

> it doesn't check every return code

This is bad

> use a 1GB block or split your file for benchmarking

I'm writing a program for manipulating VM images. A typical image size is 10G, and I want to choose a good compressor for this job. I hope some compressor will exploit long-distance similarities in such images, so a 1G limit is not okay for me.

As I said, this is one of the reasons to choose lzbench over turbobench.

Also, I just replaced all -O3 (and similar) flags in the makefile with -O0 -g and ran ./turbobench -B4G -ezstd,1/brotli,1 /tmp/debian-12.0.0-amd64-DVD-1.iso. I got Segmentation fault (core dumped). So something has gone really wrong.

Okay, now let me be more constructive. I debugged the program using gdb and found the source of the segfault. This patch fixes it:

diff --git a/time_.h b/time_.h
index 9545825..2425cd9 100755
--- a/time_.h
+++ b/time_.h
@@ -260,5 +260,5 @@ static uint64_t argtot(char *s) {
   return n*f;
 }

-static void memrcpy(unsigned char *out, unsigned char *in, unsigned n) { int i; for(i = 0; i < n; i++) out[i] = ~in[i]; }
+static void memrcpy(unsigned char *out, unsigned char *in, unsigned n) { unsigned i; for(i = 0; i < n; i++) out[i] = ~in[i]; }

But I still see error messages about incorrect compression/decompression. The likely reason is the use of int and unsigned to store numbers larger than the maximum values of those types. In particular, gdb shows that the code tries to pass a too-large outsize to codcomp. But when I changed the type to long, the program still behaved incorrectly, so it seems there are other errors.
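
For illustration, a tiny standalone program showing the truncation pattern described above (process is a made-up stand-in for any function that narrows a 64-bit size to a 32-bit parameter):

#include <stdio.h>
#include <stdint.h>

/* Made-up stand-in for a codec entry point taking a 32-bit size. */
static void process(unsigned n) { printf("processing %u bytes\n", n); }

int main(void) {
    uint64_t size = 10737418240ULL;   /* 10 GiB, like a big VM image */
    process((unsigned)size);          /* silently wraps: prints 2147483648 */
    return 0;
}

The memrcpy patch above fixes a related pattern: with int i and an unsigned n above INT_MAX, incrementing i past INT_MAX is undefined behavior, and in practice it ends up indexing out[] with a negative value.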

powturbo commented 1 year ago

I've updated TurboBench to support big files of unlimited size. You can set the codec block size with the option -b# (e.g. -b100m or -b1g). As most codecs are limited to a 1GB block size, I suggest using the default (no -b#, which is equivalent to -b1G):

./turbobench -ezstd,1 bigfile

Unlike the single-block benchmark, the results are shown at the end of the benchmark.

For your usage, I suggest using long-range or dedup compressors (lrzip, srep, ...).
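
As an aside, zstd itself can exploit long-range redundancy if long-distance matching is enabled. A minimal sketch using zstd's advanced API (v1.4.0 or later), with a toy buffer standing in for a VM image:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zstd.h>

int main(void) {
    size_t n = 1u << 20;                  /* toy input; real use: a VM image */
    char *in = malloc(n);
    memset(in, 'x', n);

    ZSTD_CCtx *cctx = ZSTD_createCCtx();
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 1);
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_enableLongDistanceMatching, 1);
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_windowLog, 27);  /* 128 MiB window */

    size_t cap = ZSTD_compressBound(n);
    char *out = malloc(cap);
    size_t r = ZSTD_compress2(cctx, out, cap, in, n);
    if (ZSTD_isError(r)) fprintf(stderr, "zstd: %s\n", ZSTD_getErrorName(r));
    else printf("%zu -> %zu bytes\n", n, r);

    ZSTD_freeCCtx(cctx);
    free(in); free(out);
    return 0;
}

On the command line the equivalent is zstd --long=27 file.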

safinaskar commented 1 year ago

@powturbo, thanks a lot! Unfortunately, problems remain. Here is the output on 10G of random data:

# rm -f *.tbb; ./turbobench -b3G -D -ezstd,1/brotli,1/memcpy /tmp/2Gi
 10737664033   100.0    1941.29    6749.46   10746.38   zstd 1          2Gi
 10737428495   100.0    1952.04    4495.78   10747.71   brotli 1        2Gi
47080110157571046199Segmentation fault (core dumped)

So we see a core dump on memcpy. 2Gi is 10G of random data from urandom (sorry for the misnomer).

Tested on 3e2989f7c8c6d957f9eaa5d1095110a84fc118d7.

Also, I already chose a compressor for my application some days ago; I decided to simply stick with zstd. :) So my interest here is theoretical now (i.e., "what if I want to use turbobench at some point in the future?").

Also, it seems from https://github.com/powturbo/TurboBench/blob/3e2989f7c8c6d957f9eaa5d1095110a84fc118d7/plugins.cc#L1544 that the size of a single block is still limited to 4G. I don't like this; I want it to be limited only by the size of RAM (for example, my RAM is 32G).

powturbo commented 1 year ago

I've fixed the memcpy issue. I see no sense in using more than 4GB as the block size: almost all of the codecs are already limited to, or crash at, large block sizes. Big files are processed in chunks. Additionally, TurboBench is an in-memory benchmark used primarily to test speed and efficiency. lzbench's block size is limited to 1.6GB.
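
To illustrate the chunked processing, a sketch of the loop shape (codec_compress is a stub standing in for a real codec call; file offsets stay 64-bit and only the bounded per-block size is narrowed):

#include <stdint.h>
#include <string.h>

/* Stub standing in for a real per-block codec call (zstd, brotli, ...);
   memcpy keeps the sketch self-contained. */
static size_t codec_compress(const uint8_t *in, size_t n, uint8_t *out) {
    memcpy(out, in, n);
    return n;
}

/* Feed a huge in-memory buffer to the codec in blocks of at most blk
   bytes, so each call stays within the codec's block-size limit. */
static uint64_t compress_in_chunks(const uint8_t *in, uint64_t insize,
                                   uint8_t *out, size_t blk) {
    uint64_t ipos = 0, opos = 0;
    while (ipos < insize) {
        size_t n = (insize - ipos < blk) ? (size_t)(insize - ipos) : blk;
        opos += codec_compress(in + ipos, n, out + opos);
        ipos += n;
    }
    return opos;
}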

safinaskar commented 1 year ago

> I've fixed the memcpy issue

Yes, I just checked.