Blosc / bcolz

A columnar data container that can be compressed.
http://bcolz.blosc.org

Latest c-blosc causes test failures in bcolz master #374

Closed estan closed 6 years ago

estan commented 6 years ago

@FrancescAlted It seems we've hit another hiccup with the Debian/Ubuntu packaging of c-blosc 1.14.2. The package has been held back because it causes test failures in bcolz.

Here's the Ubuntu test log and here's the Debian test log.

I was able to reproduce the failures using bcolz Git master branch in an Ubuntu 18.04 Docker container with the 1.14.2 package installed (see below). Any idea what could be the problem?

root@f0b7772b23f5:~/bcolz# python3 -c"import bcolz; bcolz.test()"
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
bcolz version:     1.1.3.dev15
bcolz git info:    b'1.1.2-15-g20cbcf3'
NumPy version:     1.13.3
Blosc version:     1.14.2 ($Date:: 2018-03-16 #$)
Blosc compressors: ['blosclz', 'lz4', 'lz4hc', 'snappy', 'zlib', 'zstd']
Numexpr version:   not available (version >= 2.5.2 not detected)
Dask version:   not available (version >= 0.9.0 not detected)
Python version:    3.6.5rc1 (default, Mar 14 2018, 06:54:23) 
[GCC 7.3.0]
Platform:          linux-x86_64
Byte-ordering:     little
Detected cores:    4
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Performing only a light (yet comprehensive) subset of the test suite.
If you want a more complete test, try passing the '-heavy' flag to this
script (or set the 'heavy' parameter in case you are using bcolz.test()
call).  The whole suite will take more than 30 seconds to complete on a
relatively modern CPU and around 300 MB of RAM and 500 MB of disk
[32-bit platforms will always run significantly more lightly].

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
......s......s......s......s............................................................FF.............................................................................................ssssssssssssssssssssssssssss..............ssssssssssssssssssssssssssss..............ssssssssssssssssssssssssssss..............ssssssssssssssssssssssssssssssssssssssssssssssssssssssss...............FF.FFF................................................ss..........................................ssssss...............F..F..........FFF.............................................................................................................................................................................................FF.FFF................................................................................................................ssss...................../root/bcolz/bcolz/ctable.py:48: FutureWarning: split() requires a non-empty pattern match.
  return re_str_split.split(str(x))
.............................sss.........s.......................F....................................................................................................................................................................................................................................................................................................................................................................
======================================================================
FAIL: test01b (test_carray.bloscFiltersTest)
Testing all available filters in big arrays (bcolz.defaults)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 2322, in test01b
    "carray does not seem to compress at all")
AssertionError: False is not true : carray does not seem to compress at all

======================================================================
FAIL: test01c (test_carray.bloscFiltersTest)
Testing all available filters in big arrays (context)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 2344, in test01c
    self.assertTrue(bcolz.defaults.cparams['shuffle'] == bcolz.SHUFFLE)
AssertionError: False is not true

======================================================================
FAIL: test01b (test_carray.filtersDiskTest)
Testing all available filters in big arrays (bcolz.defaults)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 2322, in test01b
    "carray does not seem to compress at all")
AssertionError: False is not true : carray does not seem to compress at all

======================================================================
FAIL: test01c (test_carray.filtersDiskTest)
Testing all available filters in big arrays (context)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 2344, in test01c
    self.assertTrue(bcolz.defaults.cparams['shuffle'] == bcolz.SHUFFLE)
AssertionError: False is not true

======================================================================
FAIL: test01a (test_carray.filtersMemoryTest)
Testing all available filters in big arrays (setdefaults)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 2302, in test01a
    "carray does not seem to compress at all")
AssertionError: False is not true : carray does not seem to compress at all

======================================================================
FAIL: test01b (test_carray.filtersMemoryTest)
Testing all available filters in big arrays (bcolz.defaults)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 2322, in test01b
    "carray does not seem to compress at all")
AssertionError: False is not true : carray does not seem to compress at all

======================================================================
FAIL: test01c (test_carray.filtersMemoryTest)
Testing all available filters in big arrays (context)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 2344, in test01c
    self.assertTrue(bcolz.defaults.cparams['shuffle'] == bcolz.SHUFFLE)
AssertionError: False is not true

======================================================================
FAIL: test01 (test_carray.miscDiskTest)
Testing __sizeof__() (big carrays)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 639, in test01
    "carray does not seem to compress at all")
AssertionError: False is not true : carray does not seem to compress at all

======================================================================
FAIL: test01 (test_carray.miscMemoryTest)
Testing __sizeof__() (big carrays)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 639, in test01
    "carray does not seem to compress at all")
AssertionError: False is not true : carray does not seem to compress at all

======================================================================
FAIL: test_repr_disk_array_append (test_carray.reprDiskTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 2430, in test_repr_disk_array_append
    self.assertEqual(expected, repr(y))
AssertionError: "carr[90 chars]evel=5, shuffle=1, cname='blosclz', quantize=0[119 chars]\n[]" != "carr[90 chars]evel=9, shuffle=0, cname='blosclz', quantize=0[119 chars]\n[]"
  carray((0,), float64)
    nbytes := 0; cbytes := 16.00 KB; ratio: 0.00
-   cparams := cparams(clevel=5, shuffle=1, cname='blosclz', quantize=0)
?                             ^          ^
+   cparams := cparams(clevel=9, shuffle=0, cname='blosclz', quantize=0)
?                             ^          ^
    chunklen := 2048; chunksize: 16384; blocksize: 0
    rootdir := '/tmp/bcolz-reprDiskTest5yj4sw34'
    mode    := 'a'
  []

======================================================================
FAIL: test_repr_disk_array_read (test_carray.reprDiskTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 2424, in test_repr_disk_array_read
    self.assertEqual(expected, repr(x))
AssertionError: "carr[90 chars]evel=5, shuffle=1, cname='blosclz', quantize=0[119 chars]\n[]" != "carr[90 chars]evel=9, shuffle=0, cname='blosclz', quantize=0[119 chars]\n[]"
  carray((0,), float64)
    nbytes := 0; cbytes := 16.00 KB; ratio: 0.00
-   cparams := cparams(clevel=5, shuffle=1, cname='blosclz', quantize=0)
?                             ^          ^
+   cparams := cparams(clevel=9, shuffle=0, cname='blosclz', quantize=0)
?                             ^          ^
    chunklen := 2048; chunksize: 16384; blocksize: 0
    rootdir := '/tmp/bcolz-reprDiskTestiflbef87'
    mode    := 'r'
  []

======================================================================
FAIL: test_repr_disk_array_write (test_carray.reprDiskTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_carray.py", line 2419, in test_repr_disk_array_write
    self.assertEqual(expected, repr(x))
AssertionError: "carr[90 chars]evel=5, shuffle=1, cname='blosclz', quantize=0[119 chars]\n[]" != "carr[90 chars]evel=9, shuffle=0, cname='blosclz', quantize=0[119 chars]\n[]"
  carray((0,), float64)
    nbytes := 0; cbytes := 16.00 KB; ratio: 0.00
-   cparams := cparams(clevel=5, shuffle=1, cname='blosclz', quantize=0)
?                             ^          ^
+   cparams := cparams(clevel=9, shuffle=0, cname='blosclz', quantize=0)
?                             ^          ^
    chunklen := 2048; chunksize: 16384; blocksize: 0
    rootdir := '/tmp/bcolz-reprDiskTestz450jgp7'
    mode    := 'w'
  []

======================================================================
FAIL: test02 (test_ctable.copyDiskTest)
Testing copy() with lower clevel
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_ctable.py", line 1025, in test02
    self.assertTrue(t['f1'].cbytes < t2['f1'].cbytes, "clevel not changed")
AssertionError: False is not true : clevel not changed

======================================================================
FAIL: test03 (test_ctable.copyDiskTest)
Testing copy() with no shuffle
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_ctable.py", line 1036, in test03
    self.assertTrue(t['f1'].cbytes < t2['f1'].cbytes, "clevel not changed")
AssertionError: False is not true : clevel not changed

======================================================================
FAIL: test01 (test_ctable.copyMemoryTest)
Testing copy() with higher clevel
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_ctable.py", line 1014, in test01
    self.assertTrue(t['f1'].cbytes > t2['f1'].cbytes, "clevel not changed")
AssertionError: False is not true : clevel not changed

======================================================================
FAIL: test02 (test_ctable.copyMemoryTest)
Testing copy() with lower clevel
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_ctable.py", line 1025, in test02
    self.assertTrue(t['f1'].cbytes < t2['f1'].cbytes, "clevel not changed")
AssertionError: False is not true : clevel not changed

======================================================================
FAIL: test03 (test_ctable.copyMemoryTest)
Testing copy() with no shuffle
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_ctable.py", line 1036, in test03
    self.assertTrue(t['f1'].cbytes < t2['f1'].cbytes, "clevel not changed")
AssertionError: False is not true : clevel not changed

======================================================================
FAIL: test01 (test_ctable.specialTest)
Testing __sizeof__() (big ctables)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/bcolz/bcolz/tests/test_ctable.py", line 1067, in test01
    "ctable does not seem to compress at all")
AssertionError: False is not true : ctable does not seem to compress at all

----------------------------------------------------------------------
Ran 1270 tests in 4.745s

FAILED (failures=18, skipped=160)
root@f0b7772b23f5:~/bcolz#

As I'm not sure whether it's a problem in Blosc or in bcolz, I'm tentatively filing it here.

@ginggs is the one who alerted me to the failure.

FrancescAlted commented 6 years ago

Yes, I can reproduce that. That's unfortunate. I think the issue is that the bcolz checks on filters are too strict. I will see if I can find some free time by the end of the week and release a new version with a fix for that.

estan commented 6 years ago

@FrancescAlted Thanks, that would be great. Hopefully we can then get a Debian package update of bcolz, and an Ubuntu feature freeze exception for bcolz as well, to get it all into Ubuntu Bionic.

danstender commented 6 years ago

The Debian package would come up immediately. Thanks in advance.

FrancescAlted commented 6 years ago

I have updated the internal C-Blosc sources to 1.14.2, and also the tests, so that the whole suite passes now. The issue only appeared with filters=bcolz.NOSHUFFLE. Not sure what was happening, but probably newer versions of C-Blosc use different defaults for the blocksize. At any rate, the kind of data that is used in the tests does not compress well with no shuffle, so I just disabled those tests.
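For readers wondering why the no-shuffle case is so sensitive, here is a minimal sketch that needs neither bcolz nor C-Blosc. It assumes zlib from the standard library as a stand-in compressor and uses a hand-written `byte_shuffle` helper (a hypothetical name, not a bcolz or Blosc function) that only imitates the idea of the shuffle filter. The input is a simple range of float64 values, not the actual data used in the bcolz tests.

```python
import struct
import zlib


def byte_shuffle(buf: bytes, itemsize: int) -> bytes:
    """Group the i-th byte of every item together.

    Hypothetical helper that only imitates the idea behind a shuffle
    filter; it is not the Blosc implementation.
    """
    return b"".join(buf[i::itemsize] for i in range(itemsize))


def compressed_sizes(values, clevel=5):
    """Return (raw, plain, shuffled) sizes in bytes for float64 values."""
    # Pack as little-endian float64, 8 bytes per item.
    raw = struct.pack(f"<{len(values)}d", *values)
    plain = len(zlib.compress(raw, clevel))
    shuffled = len(zlib.compress(byte_shuffle(raw, 8), clevel))
    return len(raw), plain, shuffled


# Slowly varying numeric data (an assumption made for illustration).
values = [float(i) for i in range(20_000)]
raw_size, plain_size, shuffled_size = compressed_sizes(values)
print("raw bytes:     ", raw_size)
print("no shuffle:    ", plain_size)
print("with shuffle:  ", shuffled_size)
```

On data like this, the shuffled buffer compresses to far fewer bytes than the unshuffled one, so an assertion that depends on the exact no-shuffle result can easily break when the compressor's internals change.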

estan commented 6 years ago

@FrancescAlted Many thanks!

@danstender When you have the time, it would be great if you could package bcolz 1.2, and try another upload of the c-blosc 1.14.2 package. Hopefully it will finally succeed now :)

danstender commented 6 years ago

1.1.2 is in, containing the relevant patch (that one went first because the tarball was already committed to the Debian repo). The latest release will follow right after.

estan commented 6 years ago

@danstender Alright, many thanks for the expediency. I believe 1.2 is what's required to get the fix for this issue (1.1.2 was tagged Feb 2, while this issue was only just fixed).

Once 1.2 is in unstable, I'll ask either @adconrad or @ginggs for a FFE for bcolz as well, to allow the migration of c-blosc 1.14.2 from -proposed to proceed.

estan commented 6 years ago

@danstender Never mind, I misunderstood you. You patched the 1.1.2 package, that's excellent. Then I can ask for an FFE to sync that.

danstender commented 6 years ago

... anyway it's only temporary, 1.2 will follow very quickly (today or maybe tomorrow)

estan commented 6 years ago

@danstender Alright, perhaps I'd better wait with requesting an Ubuntu FFE until 1.2 is in place then.

danstender commented 6 years ago

Let's bump this issue here; it will come up today. Another thing: bcolz appears not to support big-endian machines (https://bugs.debian.org/852307). Is there any prospect of a fix, or should those architectures be excluded as unsupported? (This has already been reported as #329; maybe we should continue there.)

ginggs commented 6 years ago

@estan @danstender No need for a FFe, just ping me after c-blosc and bcolz have been uploaded to Debian.

estan commented 6 years ago

@danstender Alright, great.

Yes, better continue the big endian discussion there. You'll have to ask @FrancescAlted if big endian was ever meant to be supported (I'm just the random guy trying to get c-blosc 1.14.2 into Ubuntu :) )

estan commented 6 years ago

@ginggs: Perfect. I feel so pampered by the Debian / Ubuntu packaging community lately :) You're doing an A class job.

danstender commented 6 years ago

@estan @ginggs It's in. Nice weekend everybody. Yours, the Debian Quick Action Team (please don't refer to that :-)

estan commented 6 years ago

@danstender 👌 And a nice weekend to you.

FrancescAlted commented 6 years ago

> Let's bump this issue here; it will come up today. Another thing: bcolz appears not to support big-endian machines (https://bugs.debian.org/852307). Is there any prospect of a fix, or should those architectures be excluded as unsupported? (This has already been reported as #329; maybe we should continue there.)

@danstender I have tried to address this (see comments in #329), but with no success so far. I think support for big-endian architectures will take a bit more time, so for the time being it would make more sense to label bcolz as 'does not support big-endian machines'. Hopefully support will happen, but I don't know when (of course, a PR would help accelerate this).
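Until then, a downstream test suite could guard itself with something like the following. This is only a sketch using the standard library; `is_little_endian` and `LittleEndianOnlyTest` are hypothetical names for illustration and are not part of bcolz.

```python
import sys
import unittest


def is_little_endian() -> bool:
    """True when the interpreter runs on a little-endian machine."""
    return sys.byteorder == "little"


# Hypothetical guard: the whole test class is skipped on big-endian hosts.
@unittest.skipUnless(is_little_endian(), "big-endian machines are not supported")
class LittleEndianOnlyTest(unittest.TestCase):
    def test_native_layout(self):
        # On a little-endian host the least significant byte comes first.
        self.assertEqual((1).to_bytes(2, sys.byteorder), b"\x01\x00")


if __name__ == "__main__":
    unittest.main()
```

On a big-endian architecture the class is reported as skipped rather than failed, which keeps the unsupported status visible in the test log without breaking a package build.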