mathieuchartier / mcm

MCM file compressor
GNU General Public License v3.0

check failed ret == count: repeated errors on several datasets #8

Open Intensity opened 7 years ago

Intensity commented 7 years ago

I've hit this error fairly often, in a variety of cases. The inputs are large, about 20GB to 30GB each. This is on Linux x86_64 with a statically linked binary built in a Debian 7 environment from commit 02b4e7d50a39f94d17ab3bf3d49a597a10b1f4a1 (Sat Apr 4 22:02:17 2015 -0700).

I've mostly had success with mcm, but I'm wondering what might cause this and what information I could provide to help debug it. I'll recompile an updated mcm from the latest git commits, but since segfaults or core dumps can lead to data loss under certain conditions, I wanted to point this out. In one case I believe a "-x11 -test" run succeeded, yet decompressing afterward still hit this check failure.

Is the check a hard requirement (that is, if that assert fires, is something necessarily wrong)? Is there a way to mitigate it or reduce the likelihood that it happens? Is there a recovery path? I just want to avoid the situation where I've compressed large amounts of data and can't decompress it later.

$ mcm d foo.mcm
======================================================================
mcm compressor v0.83, by Mathieu Chartier (c)2015 Google Inc.
Experimental, may contain bugs. Contact mathieu.a.chartier@gmail.com
Special thanks to: Matt Mahoney, Stephan Busch, Christopher Mattern.
======================================================================
Decompresing file foo
Metadata size=2185310
binary size 6841237436
text size 6731721113
wav16 size 747373
Decompressing binary stream size=6,841,237,436
check failed ret == count95KB/s ratio: 0.55790
Segmentation fault (core dumped)

$ mcm -h9 foo2 foo2.mcm
mcm compressor v0.83, by Mathieu Chartier (c)2015 Google Inc.
Experimental, may contain bugs. Contact mathieu.a.chartier@gmail.com
Special thanks to: Matt Mahoney, Stephan Busch, Christopher Mattern.
======================================================================
Compressing to foo2.mcm mode=high mem=9

Analyzing
30323781KB , 59748KB/s
text : 186154(222.007MB)
binary : 186155(28.7022GB)
Analyzing took 507.522s

Compressed metadata 1316287 -> 433393

Compressing binary stream size=30,818,760,758
check failed ret == count1307KB/s ratio: 0.97091
Segmentation fault (core dumped)

$ mcm -m9 -test foo3 foo3.mcm
======================================================================
mcm compressor v0.83, by Mathieu Chartier (c)2015 Google Inc.
Experimental, may contain bugs. Contact mathieu.a.chartier@gmail.com
Special thanks to: Matt Mahoney, Stephan Busch, Christopher Mattern.
======================================================================
Compressing to foo3.mcm mode=mid mem=9
Analyzing
28698319KB , 60824KB/s
text : 1(241B)
binary : 2(27.3689GB)
Analyzing took 471.82s

Compressed metadata 35 -> 35

Compressing binary stream size=29,387,079,103
check failed ret == count1294KB/s ratio: 0.97486
Segmentation fault (core dumped)

$ mcm -x11 -test foo4 foo4.mcm
======================================================================
mcm compressor v0.83, by Mathieu Chartier (c)2015 Google Inc.
Experimental, may contain bugs. Contact mathieu.a.chartier@gmail.com
Special thanks to: Matt Mahoney, Stephan Busch, Christopher Mattern.
======================================================================
Compressing to foo4.mcm mode=max mem=11
Analyzing
30323781KB , 56777KB/s
text : 186154(222.007MB)
binary : 186155(28.7022GB)
Analyzing took 534.084s

Compressed metadata 1316287 -> 433392

Compressing binary stream size=30,818,760,758
check failed ret == count997KB/s ratio: 0.964599
Segmentation fault (core dumped)

mathieuchartier commented 7 years ago

Hey, are you sure your drive is not full? It seems strange that fwrite would not finish writing. I pushed a diff that may address EINTR, in case that is what's happening: https://github.com/mathieuchartier/mcm/commit/c5a86b1f54f90aa63953c5b683ba92c417e52f0c