kiyo-masui / bitshuffle

Filter for improving compression of typed binary data.

Big endian #56

Closed kiyo-masui closed 7 years ago

keszybz commented 7 years ago

Still fails.

py.test gives slightly nicer output. For example:

______________________________________________ TestAll.test_regression _______________________________________________

self = <bitshuffle.tests.test_regression.TestAll testMethod=test_regression>

    def test_regression(self):
        for version in VERSIONS:
            file_name = OUT_FILE_TEMPLATE % version
            f = h5py.File(file_name)
            g_orig = f["origional"]
            g_comp = f["compressed"]

            for dset_name in g_comp.keys():
>               assert np.all(g_comp[dset_name][:] == g_orig[dset_name][:])
E               AssertionError: assert <function all at 0x3fff787ab2f0>(array([b'\x0b... dtype='|S10') == array([b'\x89a... dtype='|S10')
E                +  where <function all at 0x3fff787ab2f0> = np.all
E                 Full diff:
E                 + array([b'\x89a\x10\x08l8\xe9\xa9\xa0\x8b',
E                 +        b'\xcf\xba\x1f\x83\x06Hr\x90;\xe2', b'*\xbbTL\xb4\x88I\x11\x0b\xd3',
E                 +        b'3\xd1\xeeN!w\x18P\xe3\xa4',
E                 +        b'\r\xb1\x88\xdc\xae\x8c\xd0G\xb5\xb8',
E                 +        b'\xd4UB\xc7\xce\x9f\xfe1\xb4$',
E                 +        b'd\x07\xa0\xbc\xd9\xae\xe8\x12@\xe0',
E                 +        b'\xd0\xcf\x8b\x1b\xf6\xf2*EHr', b'k\xfec\x1c\x1a<\x8d\x91\xbb\x7f',
E                 +        b'c\xa7+y\xabA\x0cmf\xef', b's\x03g\xbf\x16I\x01\xa7\xf0X',
E                 - array([b'\x0b\xf3\xd1\xd8oOT\xa2\x12N', b'&\xe0\x05=\x9bu\x17H\x02\x07',
E                 -        b'+\xaaB\xe3s\xf9\x7f\x8c-$', b'\xb0\x8d\x11;u1\x0b\xe2\xad\x1d',
E                 -        b'\xcc\x8bwr\x84\xee\x18\n\xc7%', b'T\xdd*2-\x11\x92\x88\xd0\xcb',
E                 -        b'\xf3]\xf8\xc1`\x12N\t\xdcG',
E                 -        b'\x91\x86\x08\x106\x1c\x97\x95\x05\xd1', b'\x90J)z\x95\xb2v\xb9a3',
E                 -        b'LZGB\n(\xaaH\x98\xe1', b'\xeed\xbe6\xf7u\xfd\xe9\xbc\xd6',
E                 -        b'\xc3\xc0\xf1X+\x1b\xf9]\\2',
E                 -        b'\x08\xb4\x1b\xd0\xdb\x7f\x8a\xefb\xc0',
E                 -        b'\xce\xc0\xe6\xfdh\x92\x80\xe5\x0f\x1a',
E                 -        b'\xc6\xe5\xd4\x9e\xd5\x820\xb6f\xf7',
E                 -        b'\xd6\x7f\xc68X<\xb1\x89\xdd\xfe', b'\x05\xe4Q\xfa#?\xc2K\xa6f',
E                 -        b'\x18\xa6\x00\xaam\xdcW!\xf0\xfa',
E                 -        b'\x1e\x80\x14\xcc\xca%G\xa2\xedJ', b'\x8a\xb89\xf00\xdd\xb9(\xcbQ',
E                 -        b'\xfc\xc3t\xb5\x9ac0\x94\x1e\xe6',
E                 -        b'\x99\x0c_\x83\xa3\xf4d\xda\xd3\xd4', b'\xb3QK3@\\=\xcf\xf1\xbf',
E                 -        b'\xe1\xf3\xebgR\xac\xda\xc2\x06\x16', b'\xd6M\xde\xb1s\xf6F\xc0BO',
E                 -        b'\xc1\x05\xb2g<\x93\x18\x1bZA',
E                 -        b'\x93\x14\xed\xb3\xfd\x86\xea\x9c\xcc(',
E                 -        b"w\xf95\x06\xf9\xcc\xc7'\xb7z", b'\xb2_yJL\xcbj\xb5\x1f,',
E                 -        b'X\xc1}\xe3rs\xde\xe50\xda', b'\xa5Z\x9b\x10\xa6)\x0e\xaba\x8e',
E                 -        b"Y\xcf}\xad\xfdm{\xca'\xa1", b'\x1f\x8e\xba\xc5\xeaB]\xcbm\xc9',
E                 -        b'\x99\x82\x06\x03MQRU3\x7f', b'\x83>L\xc3\xcd\xc1\t\x17\x11E',
E                 -        b'\xc4E`\xa9\x03\x8b\xb0\xa80\xed', b'A\xb4\xf1\xb2\xf6H\xd4h\xd9:',
E                 -        b'\xc0\xa6\xbc\xe4\xe9\xf3\xab\x80\x80\xa4',
E                 -        b'\xa1\xe7\xa70\xbao<\xd7\x00\x9a', b"\x13f\xaeZ['m\x89\x03v",
E                 -        b'\xd4H,\xa8t\x0e2\x1b"\x99', b'ud\xe2?ZX\x10\xbc\x06\xfb',
E                 -        b'\r(\xf6\x92\xd7\x1c\xc2\x7f]\xdd',
E                 -        b'\xde\x9c\xb0\xd5\xb2o\x9a\x98/\x83',
E                 -        b'\x01}\xdd\xd7\xde\xd5\xb3\xb9%\xf4',
E                 -        b'\x0f\xdeA\x88\xfe\x91\x1d\x8f\xd5J',
E                 -        b'F\xd5\xe0\x97\x16\x8f\xe7\x0b3\xf6',
E                 -        b'\xc8\xf9F|\x01W\xf0\r\x8dr', b'\xe1\xc87\xc9\xedi\xc2Hc\xea',
E                 -        b'Zu\x95\x1f\xdc\xa1j\x16\x19\xdb', b':R\x1a\xc3\x03\xb5Iy\xdb\xd6',
E                 -        b'i\xbd\xb8\xa5\x17f\xf7]y\x10', b'/\x89\xc8\xc6\x01rl\xf8hw',
E                 -        b'\xd8\xa7j\x00\x8d\xf6u~\x19\xf5', b'\xach\x07\x98\xadP1\xbb\xbe@',
E                 -        b'\x9c\xd319\x8cS\x91\x13\x9aL', b'fc6%\x19\xe1"\xb4D\xac',
E                 -        b'\x8a\xc0\x05\xdaq\xbe?\x9e\x13\xfd',
E                 -        b'\xa5\x10\xfc\xc9\x1c\xa0\x85\r\xebe',
E                 -        b'\xed\x97\xc8\xe2q;/\x99\xe5\n', b'cR\x08\xbf~#9d+\x12',
E                 -        b'!\xfek\xd2\xac\xb3\xb8\x85\x9b\xcc',
E                 -        b'\xf5\xedT\xf1\xd4_9\xf6K\xc7', b'#\x86p.\xd9\xfd\x06R\x97}',
E                 -        b']\x80\xc8\xc99\xd3\xf4\x1c\x97\x11',
E                 -        b'}\xd1\xf6\xc7\x11;\\\x7f\xfb\xce', b'kt>\xc3\xe0\xf33\xbf\\d',
E                 -        b'\xef%\xf9\xd0\x98\xfd\xecO\xab\x83',
E                 -        b'\xb0\xb1vi\x02_\xf0\x11\xf4\xb5', b'`\xc7\x18{\x18g\x1ajR{',
E                 -        b'\xc2\xd7\xcd<\xb8\xbbdQCo', b'hsd-\xe9*I\xc8\xdb\xba',
E                 -        b'!\x03\x07\x9f\xd3\x1e\x80\xe11\xee',
E                 -        b'\x82pt\xaf\xd1~\xea\xaf\x8fI', b'\xf3\xba\x94B\xdcA\xdbC\xc7\x95',
E                 -        b'2\x05\xd3Yv\x0f\xa9\x98\xf2\x99',
E                 -        b'\xbb\xab\xcf\x9f\xd3\xe3\xe5\x7f\xaeW',
E                 -        b'\x04\x00\xd0\n\xbbg\x0c\x8d\x0f\x18', b'\xc9\x13\xa9f@\xbf^[T;',
E                 -        b'\x978\xc5\xabB\xfc\x17wN\xd4',
E                 -        b"\xbf\x00\xdbG\xed\x1e\xe6\xf4\xc1'",
E                 -        b'\xbd\x01\xe2\xce\x86\x8a\xa2\x1f\xf3\xd5',
E                 -        b'\xf0\x88\x18\x0c\x9a\x8ad\xab&\x85',
E                 -        b'\x07\xdd\x9e\x85\xab\x92\x7f+\xc9x', b'dMs\x92F\xfd{\x1a\xf7\xbf',
E                 -        b'\xfb\xdc\xed\xa9\xc0s\x98_S\x14',
E                 -        b'q\xc0~\xb0\xbf\x06/\r\x14\x11', b"'\xfb\x8c\xbaP,c\xa0A\xa6",
E                 -        b'\x99b\x19\xf0y\x17J\xb5Q\x14', b'\x99egT\xe5N\x9b\xa0\x97S',
E                 -        b'\xb0\x13\xda\xf3\xe0\xcc\xdc\xa9I\xbd',
E                 -        b'f\x0b\x92\xe3\xa0\xcc\xaf>\xdd\x0e',
E                 -        b'\xfa\xf39\xbf?\xb0\x8c!\xdf\x9c', b'\xf4oP4\x0b\xfb\x8f\x11\xfb ',
E                 -        b'\x81\xbb\x03b{\x8b]szf', b'qW6\xa6\x17x\x0e\x8d\x88\xe8',
E                 -        b"'\x89\x0e\xaa\xae\xabI\xbb\x98\xfb",
E                 -        b'j\xebji\xf2\x8b\xcc\xbad\xe2', b'\xb2@\x01\x82\x8c\x9eT\xc6Y\xaf',
E                 -        b'\x0b2\xd2\xb4\xde+\x9b\xe4\xfb\xa3',
E                 ?             ^^   ^   -   -----    ^   ^  ^
E                 +        b'\x10-\xd8\x0b\xdb\xfeQ\xf7F\x03',
E                 ?            + ^   ^  +       + ^   ^^  ^
E                 -        b'\xfa\xf7\xdc\x17\x07\xb7\xc64\x08t',
E                 -        b'\xbb\x06\xac\xe5\x1e{\x1b\x86\xbaU',
E                 -        b'\x85\xd0\xc1x\xd0$\xa5v\xc6\x8f',
E                 -        b'\xbf\xa4)\x849\xfb\xa8\x7f\xe8\xe8',
E                 -        b'\xe6\x1f\xf2$kC\x91\x8e\xcbG', b"\x94\xb8S\xe8\xb3'\xbeZ\xacH",
E                 -        b'U\x8e\x8a\xbf\xea\xb034O\x97',
E                 -        b'O\r\xfa\xc6\xc8\xc6\xac\xfex\xe0', b"$-'Z\x86\xd6\x04Y[-",
E                 -        b'r\xbe\xf0\x18\xb4\xbf\x00\x9bH\xca',
E                 -        b'\x03~\x7f\xf4\x03\xe7\x7f\xeb&4',
E                 -        b'\xa7\xbf\xaa\xe4\xb6\x02\x8f\xa0\xb05',
E                 -        b'"e\x9c\x03\x94\xf5)\xfb\xbeF',
E                 -        b"\xfd\xd1\xe6\xf5D\xc4'\x88\x1a\x97", b'\xd9smw\xd4\xf3.:\x04\xc0',
E                 -        b'\xc9\x95\xcf\xc2j6R\x00\xcdW', b'\xb1s\x93\xbf\xed\xafX5\xcd\x98',
E                 -        b'z\x9e\xd8\xad\xa5c\t~V\x1f', b'\xcb\xbdQ\xd7\xees\xd1\x17T\xdb',
E                 -        b'\xd0D;\x93\x80U\xa1\xfa\xd0e', b'w\xebZ\x84D:A\xd9\xde}',
E                 -        b'\x966iZ\xf4\xe5\x07"\xc6\x8e', b'\x17E*x\x05\xe7\x81\x86\xde\xb0',
E                 -        b'\xa6\xd9\xf9g\xd9\xab\xab"E)', b'\xd1\xefQp\xe4nW\xb8\xd6\xe5',
E                 -        b'\xfct)\xb0\xd4\x87\xe2\x18\xbe!', b'vv\x88y\xedC\xbc\xbaK\xbc',
E                 -        b'\xc3@\x94\x19\xd8\xd5\xb7$\xc4p',
E                 -        b'\x8e\x02\xb2\xab\x11<-@\xa4\xe2',
E                 -        b'\xbd\x1a\xe1Nz\x01\x97\x849\xda', b'\x99\xbb\xb7!H\xcduj\xcfy',
E                 -        b'C\xce\x96\x8b\xfc\xac\xd5\x85\x12\x9b',
E                 -        b'\x19\x99\x17\x1b\x06C\xf3t\xe8\xb8', b'|\x93\xa6bN{\x01\xc4`\x8a',
E                 -        b'\xfd\x90\xd7u\xcc\xea%W-\x93',
E                 -        b'~\x82,\xa3\xc4\xdc\xab\x82\x0c\xc9', b'p\xa8U\x0b+:g\xd9\xdf\x8a',
E                 -        b'$\x0f\xefK\xcf\x07\x1d\xec?}',
E                 -        b'\x1e\xd3"\x98\xab\xcb\xf3_\x0c\xa6',
E                 -        b'\x1b\xb4\xfb\x12F\x87|\x82-\x93', b'\x18\x07~\xf0]\\\x12s\xc5.',
E                 -        b'-U\xc6<\x15\xf3\x8f\r\xcdw', b'C\x95\x8e\x82\xf1n\x0fq\xa9/',
E                 -        b'\xa1\xf3fL\xa1\xfb\xfb1^\x18',
E                 -        b'\xdd\x1aP\xb1\x03\x82\x80\x1c\x10\xc2',
E                 -        b'\xc5g\x99\xacT\xa6:\xe9gO', b'\x9f\x19\xbd\xc2k\x97\xa3\xdd\xa6N',
E                 -        b'\x00;~\x819\xfe\x80\xaa\x18\xab', b'RF8\x02\\(\xcf\x0c\xf7\xd8',
E                 -        b'z\x181~\xc5\xc5\xad\x1d:b', b'P\xafv\x08U\xbf\x14\xbfA\x81',
E                 -        b'\x83\xec\x05\xc6R\x15.\x06\xc5J', b'\xed\xe6\x04|\xa3\xcckeR\x91',
E                 -        b'\xb8\xd3\xe4\x97\x8a\xb7%\xadu\xd6',
E                 -        b'\x1f\x90:\x89M\x1e\xb6\x8a\xef5', b'>a>\xa4\xaa=\xa3\xc9\x19.',
E                 -        b'\x16\x00\xdeaqjI_@f', b'/f\xb7\xef\xe3\xecz\x98K\xfb',
E                 -        b'*l\x94\xe8>\x0c\xca\xf7\\\x17',
E                 -        b'<\xfeW\xcf\xbe\x82\xac\xae\x90+', b'\x13D>\x8f\x10Cf\xa5\xda\xb3',
E                 -        b'3F\x95W)\xc6\x05\xed<\xa6', b'$\x19\x06\x8ev\xf60YX\xc4',
E                 -        b'\x81\xf9e\xe4:;d\x93E\x07', b'2`\x9cv\xbb\xbc\x13\xda\xdd\x0c',
E                 -        b'\x9d\xfae\xb7\xf6\xe7\xbc\xbaw\xaa',
E                 -        b'\xab\xdf\xe1\x8f\x0cv\xfb}\xbf\xb7',
E                 -        b'\xda\x07\xab_\xdd\xa6\x04\xf5\x14\n',
E                 -        b']\xb1\xa6*\xa5\xcc]\xe7\xf8y', b'\x0eI_\x84\xea\xe7i\xf9\x9f^',
E                 -        b'\xcf\xe5\xb8j\x00\x1e\xd6M;\xc7',
E                 -        b'\xb4\xfa\xe7\x8d\xcdI\xc3}`\x1c', b'<\xba\x90\x15\xba5xS\xf4\x1f',
E                 -        b"M\xfa'\xbb-\xb0EZ\x96'", b")G\xc8T'\xbd\xd1P\x8fw",
E                 -        b'\x9ap\x98\xae\xaf\xd8\xafS\x11\xb6',
E                 -        b'\x94(K\x0e\x16S\x84\xc9\xbf\x17', b'\x95f\xbbj$\xae\xaa\xbe\x86}',
E                 -        b'\xf6\x96\x87\n)\x9ev\xf4\xa9t',
E                 -        b'\x85a\xa9`\x03\x9e.\xd1\xd3\xf4', b'\xb3\t\xec:\xbeQ\xf8`\x82<',
E                 -        b'>m\xd4\x0es?%\x98\xd03', b'\x00\xfa\xcb\xba\xa59\x13\xb6hU',
E                 -        b'\xb5\xf07\xcar\xe8v\xd3s\x19', b'Y\xf9[\x84; \xc9\xf3\xa0)',
E                 -        b'\x01\xeas\x88\xf9\xed\xbf\x95\n\xbf',
E                 -        b'2mHp\xea\xd4\x0c\t\x1e\xa3', b'\xca\x86\x05\xbb\x8dE5\xd9-l',
E                 -        b'\xa7\ng\xa5\x91\x89QG=\xed',
E                 -        b'\x02\x9a\xd6\x1dP\xd5\xbb\xe9\xe1q',
E                 -        b'\xfeI"q\xfd/\xb1\x18\x0c\xe6', b'\xa9Gl\x1c\xbc\xdfF\xc7\xcc\xdf',
E                 -        b'\x85r\x8e\xbc\xcc\x05\xd9\x9bkz', b'\xb2?5\xeb\xc3\x02w\xdbW\xd5',
E                 -        b'\xbd\xd8\xd3\xa3$&\x16\x81m\x98', b'eMf\x9e8\xe0\xa1\xd3\xfdr',
E                 -        b'\xf8\x93}\xc93\xe18\xe9l\x8d', b'\x7f\xba\xa09o\x07o_\x17\x80',
E                 -        b'\xcc\xe3\x16\r\x86b\xdd?a\xc4', b';\xd4\xfcJ{\x0eK\x83\x0bd',
E                 -        b'\x1c\x06!\x03\xe3\x01v44h', b'A\xc2\x99O\xaa\xb8\xc7\x00\xd3\x9c',
E                 -        b"\xf0\x10.\x0e'\x1f\xc49r%", b'\xda\xe9^D\xf4\xde>\x92\x8d\x8f',
E                 -        b'\xaeAy\xa6H\xb9\x1521 ', b't\xb4\xde\x07\x8b\xc7Pg\xe0\xcb',
E                 -        b'\x1c\xb5\xee\x02\xa8\x85\xf8\x93\xf9I',
E                 -        b'\x16:\xa3s \r\xaf\xf4\xcb{', b'\x15@\xc5\xc4\xc0=@T\x1c\xe8',
E                 -        b'z\xf2\xb7\xab\xb0D\x05\xd6N\xef', b'\x03"\x86"1?\xcf\x15\xf3\xa8',
E                 -        b'\xb6\x17`w\x80\xb2\xc9a\xee\x8a',
E                 -        b'\x1b\x03\n4\x7f"\xe7\xea\x19\x0b',
E                 -        b'I\xde#o\xae\xb5\x82\xcd\x95=', b'dJi\r\x8a\x86C15&',
E                 -        b'\x87\xe9}\xbf#\x9d\xb1\xe5\x92\xd3',
E                 -        b'\xfb\xb4\xc93N\x852|\xee\x94', b'\xc9:\xecN8/E\x92\x0bZ'],
E                 +        b'\xc3\x03\x8f\x1a\xd4\xd8\x9f\xba:L', b'w&}l\xef\xae\xbf\x97=k',
E                 +        b'2Z\xe2BP\x14U\x12\x19\x87', b'\tR\x94^\xa9Mn\x9d\x86\xcc',
E                 +        b'\x87\xcf\xd7\xe6J5[C`h', b'\xcd\x8a\xd2\xcc\x02:\xbc\xf3\x8f\xfd',
E                 +        b'\x990\xfa\xc1\xc5/&[\xcb+', b'?\xc3.\xadY\xc6\x0c)xg',
E                 +        b'Q\x1d\x9c\x0f\x0c\xbb\x9d\x14\xd3\x8a', b'x\x01(3S\xa4\xe2E\xb7R',
E                 +        b'\x18e\x00U\xb6;\xea\x84\x0f_', b"\xa0'\x8a_\xc4\xfcC\xd2ef",
E                 +        b'\x9a\xf3\xbe\xb5\xbf\xb6\xdeS\xe4\x85',
E                 +        b'\xa5Z\xd9\x08e\x94p\xd5\x86q', b'\x1a\x83\xbe\xc7N\xce{\xa7\x0c[',
E                 +        b'M\xfa\x9eR2\xd3V\xad\xf84', b'\xee\x9f\xac`\x9f3\xe3\xe4\xed^',
E                 +        b'\xc9(\xb7\xcd\xbfaW93\x14', b'\x83\xa0M\xe6<\xc9\x18\xd8Z\x82',
E                 +        b'k\xb2{\x8d\xceob\x03B\xf2', b'\xc8fuZ\xda\xe4\xb6\x91\xc0n',
E                 +        b'\x85\xe7\xe5\x0c]\xf6<\xeb\x00Y', b"\x03e='\x97\xcf\xd5\x01\x01%",
E                 +        b'\x82-\x8fMo\x12+\x16\x9b\\',
E                 +        b'#\xa2\x06\x95\xc0\xd1\r\x15\x0c\xb7',
E                 +        b'\xc1|2\xc3\xb3\x83\x90\xe8\x88\xa2',
E                 +        b'\x99A`\xc0\xb2\x8aJ\xaa\xcc\xfe', b'\xf8q]\xa3WB\xba\xd3\xb6\x93',
E                 +        b'\x13\x9fb>\x80\xea\x0f\xb0\xb1N',
E                 +        b'b\xab\x07\xe9h\xf1\xe7\xd0\xcco',
E                 +        b'\xf0{\x82\x11\x7f\x89\xb8\xf1\xabR',
E                 +        b'\x80\xbe\xbb\xeb{\xab\xcd\x9d\xa4/',
E                 +        b'{9\r\xabM\xf6Y\x19\xf4\xc1', b'\xb0\x14oI\xeb8C\xfe\xba\xbb',
E                 +        b'\xae&G\xfcZ\x1a\x08=`\xdf', b'+\x124\x15.pL\xd8D\x99',
E                 +        b'9\xcb\x8c\x9c1\xca\x89\xc8Y2',
E                 +        b'5\x16\xe0\x19\xb5\n\x8c\xdd}\x02',
E                 +        b'\x1b\xe5V\x00\xb1o\xae~\x98\xaf',
E                 +        b'\xf4\x91\x13c\x80N6\x1f\x16\xee',
E                 +        b'\x96\xbd\x1d\xa5\xe8f\xef\xba\x9e\x08',
E                 +        b'\\JX\xc3\xc0\xad\x92\x9e\xdbk', b'Z\xae\xa9\xf8;\x85Vh\x98\xdb',
E                 +        b'\x87\x13\xec\x93\xb7\x96C\x12\xc6W',
E                 +        b'\xc4a\x0et\x9b\xbf`J\xe9\xbe', b'\xaf\xb7*\x8f+\xfa\x9co\xd2\xe3',
E                 +        b'\x84\x7f\xd6K5\xcd\x1d\xa1\xd93', b'\xc6J\x10\xfd~\xc4\x9c&\xd4H',
E                 +        b'\xb7\xe9\x13G\x8e\xdc\xf4\x99\xa7P',
E                 +        b'\xa5\x08?\x938\x05\xa1\xb0\xd7\xa6',
E                 +        b'Q\x03\xa0[\x8e}\xfcy\xc8\xbf', b'f\xc6l\xa4\x98\x87D-"5',
E                 +        b'\x16\xce&\xb4\x97T\x92\x13\xdb]',
E                 +        b'C\xeb\xb3<\x1d\xdd&\x8a\xc2\xf6',
E                 +        b'\x06\xe3\x18\xde\x18\xe6XVJ\xde',
E                 +        b'\r\x8dn\x96@\xfa\x0f\x88/\xad',
E                 +        b'\xf7\xa4\x9f\x0b\x19\xbf7\xf2\xd5\xc1',
E                 +        b'\xd6.|\xc3\x07\xcf\xcc\xfd:&', b'\xbe\x8bo\xe3\x88\xdc:\xfe\xdfs',
E                 +        b'\xba\x01\x13\x93\x9c\xcb/8\xe9\x88',
E                 +        b'\xe9\x1c\xa3\xd5B?\xe8\xeer+', b'\x93\xc8\x95f\x02\xfdz\xda*\xdc',
E                 +        b' \x00\x0bP\xdd\xe60\xb1\xf0\x18',
E                 +        b'\xdd\xd5\xf3\xf9\xcb\xc7\xa7\xfeu\xea',
E                 +        b'L\xa0\xcb\x9an\xf0\x95\x19O\x99', b'\xcf])B;\x82\xdb\xc2\xe3\xa9',
E                 +        b'A\x0e.\xf5\x8b~W\xf5\xf1\x92',
E                 +        b'\x84\xc0\xe0\xf9\xcbx\x01\x87\x8cw',
E                 +        b'\xe4\xdf1]\n4\xc6\x05\x82e', b'\x8e\x03~\r\xfd`\xf4\xb0(\x88',
E                 +        b'\xdf;\xb7\x95\x03\xce\x19\xfa\xca(',
E                 +        b'&\xb2\xceIb\xbf\xdeX\xef\xfd',
E                 +        b'\xe0\xbby\xa1\xd5I\xfe\xd4\x93\x1e', b'\x0f\x11\x180YQ&\xd5d\xa1',
E                 +        b'\xbd\x80GsaQE\xf8\xcf\xab', b'\xfd\x00\xdb\xe2\xb7xg/\x83\xe4',
E                 +        b'\x8e\xeale\xe8\x1ep\xb1\x11\x17',
E                 +        b'\x81\xdd\xc0F\xde\xd1\xba\xce^f',
E                 +        b'/\xf6\n,\xd0\xdf\xf1\x88\xdf\x04',
E                 +        b'_\xcf\x9c\xfd\xfc\r1\x84\xfb9', b'f\xd0I\xc7\x053\xf5|\xbbp',
E                 +        b'\r\xc8[\xcf\x073;\x95\x92\xbd',
E                 +        b'\x99\xa6\xe6*\xa7r\xd9\x05\xe9\xca',
E                 +        b'\x99F\x98\x0f\x9e\xe8R\xad\x8a(',
E                 +        b'\xfd%\x94!\x9c\xdf\x15\xfe\x17\x17',
E                 +        b'\xa1\x0b\x83\x1e\x0b$\xa5nc\xf1', b'\xdd`5\xa7x\xde\xd8a]\xaa',
E                 +        b'_\xef;\xe8\xe0\xedc,\x10.', b"\xd0LK-{\xd4\xd9'\xdf\xc5",
E                 +        b'M\x02\x80A1y*c\x9a\xf5', b'V\xd7V\x96O\xd13]&G',
E                 +        b'\xe4\x91pUu\xd5\x92\xdd\x19\xdf', b"\xe5\xfdU'm@\xf1\x05\r\xac",
E                 +        b'\xc0~\xfe/\xc0\xe7\xfe\xd7d,', b'N}\x0f\x18-\xfd\x00\xd9\x12S',
E                 +        b'$\xb4\xe4Zak \x9a\xda\xb4', b'\xf2\xb0_c\x13c5\x7f\x1e\x07',
E                 +        b'\xaaqQ\xfdW\r\xcc,\xf2\xe9', b')\x1d\xca\x17\xcd\xe4}Z5\x12',
E                 +        b'g\xf8O$\xd6\xc2\x89q\xd3\xe2',
E                 +        b'\x0b"\xdc\xc9\x01\xaa\x85_\x0b\xa6',
E                 +        b'\xd3\xbd\x8a\xebw\xce\x8b\xe8*\xdb',
E                 +        b'^y\x1b\xb5\xa5\xc6\x90~j\xf8',
E                 +        b'\x8d\xce\xc9\xfd\xb7\xf5\x1a\xac\xb3\x19',
E                 +        b'\x93\xa9\xf3CVlJ\x00\xb3\xea', b'\x9b\xce\xb6\xee+\xcft\\ \x03',
E                 +        b'\xbf\x8bg\xaf"#\xe4\x11X\xe9', b'D\xa69\xc0)\xaf\x94\xdf}b',
E                 +        b'\xc3\x02)\x98\x1b\xab\xed$#\x0e', b'nn\x11\x9e\xb7\xc2=]\xd2=',
E                 +        b'?.\x94\r+\xe1G\x18}\x84', b"\x8b\xf7\x8a\x0e'v\xea\x1dk\xa7",
E                 +        b'e\x9b\x9f\xe6\x9b\xd5\xd5D\xa2\x94',
E                 +        b'\xe8\xa2T\x1e\xa0\xe7\x81a{\r', b'il\x96Z/\xa7\xe0Dcq',
E                 +        b'\xee\xd7Z!"\\\x82\x9b{\xbe', b'~A4\xc5#;\xd5A0\x93',
E                 +        b'\xbf\t\xeb\xae3W\xa4\xea\xb4\xc9', b'>\xc9eFr\xde\x80#\x06Q',
E                 +        b'\x98\x99\xe8\xd8`\xc2\xcf.\x17\x1d', b'\xc2si\xd1?5\xab\xa1H\xd9',
E                 +        b'\x99\xdd\xed\x84\x12\xb3\xaeV\xf3\x9e',
E                 +        b'\xbdX\x87r^\x80\xe9!\x9c[', b'q@M\xd5\x88<\xb4\x02%G',
E                 +        b'\x85\xcff2\x85\xdf\xdf\x8cz\x18',
E                 +        b'\xc2\xa9qA\x8fv\xf0\x8e\x95\xf4',
E                 +        b'\xb4\xaac<\xa8\xcf\xf1\xb0\xb3\xee',
E                 +        b'\x18\xe0~\x0f\xba:H\xce\xa3t', b'\xd8-\xdfHb\xe1>A\xb4\xc9',
E                 +        b'x\xcbD\x19\xd5\xd3\xcf\xfa0e',
E                 +        b'$\xf0\xf7\xd2\xf3\xe0\xb87\xfc\xbe',
E                 +        b'\x0e\x15\xaa\xd0\xd4\\\xe6\x9b\xfbQ', b'\xc17\xa0cJ\xa8t`\xa3R',
E                 +        b'\n\xf5n\x10\xaa\xfd(\xfd\x82\x81',
E                 +        b'^\x18\x8c~\xa3\xa3\xb5\xb8\\F', b'Jb\x1c@:\x14\xf30\xef\x1b',
E                 +        b'\x00\xdc~\x81\x9c\x7f\x01U\x18\xd5',
E                 +        b'\xf9\x98\xbdC\xd6\xe9\xc5\xbber',
E                 +        b'\xa3\xe6\x995*e\\\x97\xe6\xf2', b'\xbbX\n\x8d\xc0A\x018\x08C',
E                 +        b'<\x7f\xea\xf3}A5u\t\xd4', b'T6)\x17|0S\xef:\xe8',
E                 +        b'\xf4f\xed\xf7\xc77^\x19\xd2\xdf', b'h\x00{\x86\x8eV\x92\xfa\x02f',
E                 +        b'|\x86|%U\xbc\xc5\x93\x98t', b'\xf8\t\\\x91\xb2xmQ\xf7\xac',
E                 +        b"\x1d\xcb'\xe9Q\xed\xa4\xb5\xaek", b'\xb7g >\xc53\xd6\xa6J\x89',
E                 +        b'[\xe0\xd5\xfa\xbbe \xaf(P', b'\xd5\xfb\x87\xf10n\xdf\xbe\xfd\xed',
E                 +        b'\xb9_\xa6\xedo\xe7=]\xeeU', b'L\x069n\xdd=\xc8[\xbb0',
E                 +        b"\x81\x9f\xa6'\\\xdc&\xc9\xa2\xe0", b'$\x98`qno\x0c\x9a\x1a#',
E                 +        b'\xccb\xa9\xea\x94c\xa0\xb7<e', b'\xc8"|\xf1\x08\xc2f\xa5[\xcd',
E                 +        b'Y\x0e\x19u\xf5\x1b\xf5\xca\x88m',
E                 +        b'\x94\xe2\x13*\xe4\xbd\x8b\n\xf1\xee',
E                 +        b'\xb2_\xe4\xdd\xb4\r\xa2Zi\xe4', b'<]\t\xa8]\xac\x1e\xca/\xf8',
E                 +        b'-_\xe7\xb1\xb3\x92\xc3\xbe\x068',
E                 +        b'\xf3\xa7\x1dV\x00xk\xb2\xdc\xe3', b'p\x92\xfa!W\xe7\x96\x9f\xf9z',
E                 +        b'\xba\x8deT\xa53\xba\xe7\x1f\x9e',
E                 +        b'\xad\x0f\xecSN\x17n\xcb\xce\x98',
E                 +        b'\x00_\xd3]\xa5\x9c\xc8m\x16\xaa',
E                 +        b'|\xb6+p\xce\xfc\xa4\x19\x0b\xcc', b'\xcd\x907\\}\x8a\x1f\x06A<',
E                 +        b'\xa1\x86\x95\x06\xc0yt\x8b\xcb/', b'oi\xe1P\x94yn/\x95.',
E                 +        b'\xa9f\xddV$uU}a\xbe', b')\x14\xd2ph\xca!\x93\xfd\xe8',
E                 +        b'\x95\xe268=\xfbb\xe33\xfb', b'\x7f\x92D\x8e\xbf\xf4\x8d\x180g',
E                 +        b'@Yk\xb8\n\xab\xdd\x97\x87\x8e',
E                 +        b'\xe5P\xe6\xa5\x89\x91\x8a\xe2\xbc\xb7',
E                 +        b'Sa\xa0\xdd\xb1\xa2\xac\x9b\xb46', b'L\xb6\x12\x0eW+0\x90x\xc5',
E                 +        b'\x80W\xce\x11\x9f\xb7\xfd\xa9P\xfd',
E                 +        b'\x9a\x9f\xda!\xdc\x04\x93\xcf\x05\x94',
E                 +        b'\xdc+?R\xdep\xd2\xc1\xd0&', b'3\xc7h\xb0aF\xbb\xfc\x86#',
E                 +        b'\xfe]\x05\x9c\xf6\xe0\xf6\xfa\xe8\x01',
E                 +        b'\x1f\xc9\xbe\x93\xcc\x87\x1c\x976\xb1',
E                 +        b'\xa6\xb2fy\x1c\x07\x85\xcb\xbfN',
E                 +        b'\xbd\x1b\xcb\xc5$dh\x81\xb6\x19',
E                 +        b'M\xfc\xac\xd7\xc3@\xee\xdb\xea\xab', b'\xa1Nq=3\xa0\x9b\xd9\xd6^',
E                 +        b'h\\\xc5\xce\x04\xb0\xf5/\xd3\xde',
E                 +        b'8\xadw@\x15\xa1\x1f\xc9\x9f\x92',
E                 +        b'.-{\xe0\xd1\xe3\n\xe6\x07\xd3',
E                 +        b'u\x82\x9ee\x12\x9d\xa8L\x8c\x04', b'[\x97z"/{|I\xb1\xf1',
E                 +        b'\x0f\x08tp\xe4\xf8#\x9cN\xa4', b'\x82C\x99\xf2U\x1d\xe3\x00\xcb9',
E                 +        b'8`\x84\xc0\xc7\x80n,,\x16',
E                 +        b'\xe1\x97\xbe\xfd\xc4\xb9\x8d\xa7I\xcb',
E                 +        b'&R\x96\xb0Qa\xc2\x8c\xacd', b'\x92{\xc4\xf6u\xadA\xb3\xa9\xbc',
E                 +        b'\xd8\xc0P,\xfeD\xe7W\x98\xd0', b'm\xe8\x06\xee\x01M\x93\x86wQ',
E                 +        b'\xc0DaD\x8c\xfc\xf3\xa8\xcf\x15', b'^O\xed\xd5\r"\xa0kr\xf7',
E                 +        b'\xa8\x02\xa3#\x03\xbc\x02*8\x17', b'\xfb\xb4\xc93N\x852|\xee\x94',
E                 +        b'\xc9:\xecN8/E\x92\x0bZ'],
E                 dtype='|S10'))

tests/test_regression.py:35: AssertionError
keszybz commented 7 years ago

Nose:

$ nosetests-3.5 tests/
....F..............F.......F........F.........F..F..........F
======================================================================
FAIL: test_trans_bit_elem_scal (bitshuffle.tests.test_ext.TestOddLengths)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 486, in tearDown
    self.assertTrue(np.all(out == ans))
AssertionError: False is not true

======================================================================
FAIL: test_03a_trans_bit_byte (bitshuffle.tests.test_ext.TestProfile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 70, in tearDown
    self.assertTrue(np.all(ans == out.view(np.uint8)))
AssertionError: False is not true

======================================================================
FAIL: test_04e_trans_bit_elem_64 (bitshuffle.tests.test_ext.TestProfile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 70, in tearDown
    self.assertTrue(np.all(ans == out.view(np.uint8)))
AssertionError: False is not true

======================================================================
FAIL: test_06g_untrans_bit_elem_64 (bitshuffle.tests.test_ext.TestProfile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 73, in tearDown
    self.assertTrue(np.all(ans == out.view(np.uint8)))
AssertionError: False is not true

======================================================================
FAIL: test_09a_trans_bit_elem_scal_64 (bitshuffle.tests.test_ext.TestProfile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 70, in tearDown
    self.assertTrue(np.all(ans == out.view(np.uint8)))
AssertionError: False is not true

======================================================================
FAIL: test_09d_untrans_bit_elem_scal_64 (bitshuffle.tests.test_ext.TestProfile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 73, in tearDown
    self.assertTrue(np.all(ans == out.view(np.uint8)))
AssertionError: False is not true

======================================================================
FAIL: test_regression (bitshuffle.tests.test_regression.TestAll)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_regression.py", line 36, in test_regression
    == g_orig[dset_name][:]))
AssertionError: False is not true

----------------------------------------------------------------------
Ran 61 tests in 2.139s

FAILED (failures=7)
[fedora@puiterwijk---zbyszek-ppc bitshuffle]$ PYTHONPATH=. nosetests-3.5 tests/
....F..............F.......F........F.........F..F..........F
======================================================================
FAIL: test_trans_bit_elem_scal (bitshuffle.tests.test_ext.TestOddLengths)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 486, in tearDown
    self.assertTrue(np.all(out == ans))
AssertionError: False is not true

======================================================================
FAIL: test_03a_trans_bit_byte (bitshuffle.tests.test_ext.TestProfile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 70, in tearDown
    self.assertTrue(np.all(ans == out.view(np.uint8)))
AssertionError: False is not true

======================================================================
FAIL: test_04e_trans_bit_elem_64 (bitshuffle.tests.test_ext.TestProfile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 70, in tearDown
    self.assertTrue(np.all(ans == out.view(np.uint8)))
AssertionError: False is not true

======================================================================
FAIL: test_06g_untrans_bit_elem_64 (bitshuffle.tests.test_ext.TestProfile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 73, in tearDown
    self.assertTrue(np.all(ans == out.view(np.uint8)))
AssertionError: False is not true

======================================================================
FAIL: test_09a_trans_bit_elem_scal_64 (bitshuffle.tests.test_ext.TestProfile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 70, in tearDown
    self.assertTrue(np.all(ans == out.view(np.uint8)))
AssertionError: False is not true

======================================================================
FAIL: test_09d_untrans_bit_elem_scal_64 (bitshuffle.tests.test_ext.TestProfile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_ext.py", line 73, in tearDown
    self.assertTrue(np.all(ans == out.view(np.uint8)))
AssertionError: False is not true

======================================================================
FAIL: test_regression (bitshuffle.tests.test_regression.TestAll)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fedora/bitshuffle/build/lib.linux-ppc64-3.5/bitshuffle/tests/test_regression.py", line 36, in test_regression
    == g_orig[dset_name][:]))
AssertionError: False is not true

----------------------------------------------------------------------
Ran 61 tests in 1.705s

FAILED (failures=7)
kiyo-masui commented 7 years ago

Okay, I'll need to think about it a bit more. I'm clearly missing something.

I suppose I should reference Blosc/c-blosc#181 here.

maropu commented 7 years ago

Hi, any update? We are hitting this issue on BE platforms (AIX and s390x).

xerial commented 7 years ago

@kiyo-masui Actually this issue has been a blocker for releasing a new version of snappy-java, which includes bitshuffle APIs for Java. I appreciate your help.

kiyo-masui commented 7 years ago

Sorry for my slow reply, especially since it is blocking your release.

I haven't had time to work on this unfortunately. The status is as follows: there is a working patch in Blosc/c-blosc#181 but it is inadequate since you need to supply the endianness to the compiler by hand, and it adds a lot of extra computation for big-endian machines. This PR includes code that I think should work, but doesn't.

What needs to happen is for someone to compare the two patches against each other and figure out the issue with the code in this PR. I'm not sure when I'd be able to get to this. Anyone want to take a crack at it?

xerial commented 7 years ago

Thanks for the explanation. I understand the current situation.

odaira commented 7 years ago

I am trying to reproduce the problem on s390x now.

odaira commented 7 years ago

The patch in Blosc/c-blosc#181 emulates little endian's behavior on big endian, while the code in this PR produces outputs different from little endian's, although the outputs themselves are "correct" in that they are certain forms of transpose of the inputs and unshuffling restores the original inputs. I guess certain tests that assume little endian's bit-by-bit results can fail on big endian with the code in this PR.

What is the desired behavior? Is bitshuffle supposed to produce exactly the same shuffled results both on little endian and big endian? I think this is more desirable for data interchangeability across different-endian machines.
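To make the interchange concern concrete, here is a minimal sketch in Python (not the library's C code). It assumes a toy word-level operation, a rotation by 8 bits, as a stand-in for the real bit transpose; `shuffle`, `unshuffle`, `rotl8` and `rotr8` are hypothetical names used only for illustration. Each byte order round-trips on its own, yet the shuffled bytes differ, so data shuffled on one kind of machine would not unshuffle correctly on the other.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def rotl8(x):
    """Toy word-level operation standing in for the real bit transpose."""
    return ((x << 8) | (x >> 56)) & MASK64


def rotr8(x):
    """Inverse of rotl8."""
    return ((x >> 8) | (x << 56)) & MASK64


def shuffle(data, byteorder):
    # Load 8 bytes as one 64-bit word the way a machine of the given
    # byte order would, operate on the word, and store it back.
    x = int.from_bytes(data, byteorder)
    return rotl8(x).to_bytes(8, byteorder)


def unshuffle(data, byteorder):
    x = int.from_bytes(data, byteorder)
    return rotr8(x).to_bytes(8, byteorder)


data = bytes([1, 2, 3, 4, 5, 6, 7, 8])
le = shuffle(data, "little")
be = shuffle(data, "big")

print(le.hex(), be.hex())                 # the two shuffled forms differ
print(unshuffle(le, "little") == data)    # same byte order: round-trips
print(unshuffle(le, "big") == data)       # mixed byte order: does not
```

The same effect applies to any operation defined on 64-bit words rather than on bytes, which is why identical output across platforms requires either byte-wise code or a byte swap around the word-level step.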

kiyo-masui commented 7 years ago

The desired behaviour is that of Blosc/c-blosc#181.

I think I might have just confused myself into thinking that what I wrote in this PR did the same thing. Reconsidering, I'm pretty sure I was wrong on this.

Okay, we need a portable way to do what was done in Blosc/c-blosc#181. Stack Overflow tells me I should detect endianness at runtime, not at compile time. That complicates making sure the solution is a no-op on little-endian machines. To keep branches out of the loops, I think we need two versions of each of the functions that use TRANS_BIT_8X8.

Also, we need a portable and efficient version of bswap_64. OSX and some other OSes don't seem to have byteswap.h.
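One portable way to write a 64-bit byte swap without byteswap.h is three mask-and-shift stages. The sketch below is in Python for illustration and is not necessarily the implementation that ended up in the patch; `bswap64` is a hypothetical name. In C the same expressions would operate on a `uint64_t`, where the explicit 64-bit masking done here is implicit.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def bswap64(x):
    """Reverse the byte order of a 64-bit unsigned integer."""
    x &= MASK64
    # Stage 1: swap adjacent bytes.
    x = ((x & 0x00FF00FF00FF00FF) << 8) | ((x >> 8) & 0x00FF00FF00FF00FF)
    # Stage 2: swap adjacent 16-bit halves of each 32-bit group.
    x = ((x & 0x0000FFFF0000FFFF) << 16) | ((x >> 16) & 0x0000FFFF0000FFFF)
    # Stage 3: swap the two 32-bit halves.
    x = ((x & 0x00000000FFFFFFFF) << 32) | (x >> 32)
    return x


print(hex(bswap64(0x0102030405060708)))
```

Applying the function twice returns the original value, which makes it easy to test on any platform regardless of its native byte order.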

kiyo-masui commented 7 years ago

Okay, I think this works.

It now swaps the bytes on BE machines. It uses the byte swap from byteswap.h on systems where that header is expected to be available, and otherwise uses a custom implementation. The custom implementation requires 15 instructions, compared to 18 instructions plus moving memory around for the rest of the algorithm. Bitshuffle will be about a factor of 2 slower on BE machines.

I did add a branch inside the loop, but I've checked that the compiler (gcc 4.8) is able to optimize it away on my Mac. The assembly code is identical to what it was before.

One thing that needs to happen is to expand and test the list of systems where byteswap.h should be present. Currently this is just #if defined (__linux__). I have no idea if this is always right (although it seems to work on the Travis CI machines, which are Ubuntu/Linaro).

Can someone test this on a BE machine?

odaira commented 7 years ago

Thanks much for the new patches, ... but can I work a little bit more on the BE version of TRANS_BIT_8X8 as we don't want to have additional overhead on BE?

kiyo-masui commented 7 years ago

Absolutely, that would be ideal!

I copied and pasted the code from http://www.hackersdelight.org/hdcodetxt/transpose8.c.txt. I was unable to dissect exactly how it works, although it seems like it should be a fun problem.
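For anyone who wants to dissect it, here is a sketch in Python of the three mask-and-shift steps of that style of 8x8 bit transpose, checked against a naive bit-by-bit transpose. The name `trans_bit_8x8` mirrors the macro name but is a hypothetical Python function, and the indexing convention (byte r counted from the least significant byte, bit c counted from the least significant bit) is an assumption of this sketch. Each step swaps off-diagonal blocks: single bits inside 2x2 blocks, then 2x2 blocks inside 4x4 blocks, then the two off-diagonal 4x4 blocks.

```python
def trans_bit_8x8(x):
    """Transpose a 64-bit word viewed as an 8x8 bit matrix."""
    # Swap bit (r, c) with bit (r + 1, c - 1) for even r and odd c.
    t = (x ^ (x >> 7)) & 0x00AA00AA00AA00AA
    x = x ^ t ^ (t << 7)
    # Swap 2x2 blocks: (r, c) with (r + 2, c - 2).
    t = (x ^ (x >> 14)) & 0x0000CCCC0000CCCC
    x = x ^ t ^ (t << 14)
    # Swap 4x4 blocks: (r, c) with (r + 4, c - 4).
    t = (x ^ (x >> 28)) & 0x00000000F0F0F0F0
    x = x ^ t ^ (t << 28)
    return x


def naive_transpose(x):
    """Reference version: move bit (r, c) to (c, r) one bit at a time."""
    y = 0
    for r in range(8):
        for c in range(8):
            if (x >> (8 * r + c)) & 1:
                y |= 1 << (8 * c + r)
    return y


x = 0x0123456789ABCDEF
print(trans_bit_8x8(x) == naive_transpose(x))
```

Because the operation is defined on the integer value of the word, the bytes it produces in memory depend on how the 8 input bytes were loaded, which is where the byte order of the host comes in.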

At least now we know we don't have to worry about if (big_endian) statements in the loop, as long as we keep it simple.

kiyo-masui commented 7 years ago

This is superseded by PR #58 .