lebedov / msgpack-numpy

Serialize numpy arrays using msgpack

The array unpacked is completely different than the packed one #15

Closed subiol closed 7 years ago

subiol commented 7 years ago

I installed the latest msgpack-numpy (0.3.8) using pip. I am using Python 3.6.0, numpy 1.12.0, and msgpack-python 0.4.8. The array I get back after unpacking is completely different from the one I packed. The example in the README works fine, so I have created a short script that reproduces the behaviour:

import numpy as np
import msgpack
import msgpack_numpy as m

thedata = [(39133, 628009281, 103670,  4, b'a'),
           (39133, 628009283, 103670, 10, b'a'),
           (39198, 628010832, 103662, 28, b'b'),
           (43248, 628119055, 103247, 26, b'b'),
           (43252, 628119143, 103246,  6, b'b'),
           (43254, 628119186, 103246, 22, b'b')]

thedtype = np.dtype([ ('arg0', np.uint32),
                      ('arg1', np.uint32),
                      ('arg2', np.uint32),
                      ('arg3', np.uint32),
                      ('arg4', 'S1') ])

np_arr = np.array(thedata, dtype=thedtype)
print(np_arr)
x_enc = msgpack.packb(np_arr, default=m.encode)
decoded_arr = msgpack.unpackb(x_enc, object_hook=m.decode)
print(decoded_arr)

When I run this script I get:

[(39133, 628009281, 103670, 4, b'a')
 (39133, 628009283, 103670, 10, b'a')
 (39198, 628010832, 103662, 28, b'b')
 (43248, 628119055, 103247, 26, b'b')
 (43252, 628119143, 103246, 6, b'b')
 (43254, 628119186, 103246, 22, b'b')]
[[ -35 -104 0 0 65 -87 110 37 -10 -108 1 0 4 0 0 0 97]
 [ -35 -104 0 0 67 -87 110 37 -10 -108 1 0 10 0 0 0 97]
 [ 30 -103 0 0 80 -81 110 37 -18 -108 1 0 28 0 0 0 98]
 [ -16 -88 0 0 15 86 112 37 79 -109 1 0 26 0 0 0 98]
 [ -12 -88 0 0 103 86 112 37 78 -109 1 0 6 0 0 0 98]
 [ -10 -88 0 0 -110 86 112 37 78 -109 1 0 22 0 0 0 98]]

So, as you can see, the packed array and the unpacked array are very different.
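
One reading of the output above, not confirmed anywhere in this thread: each decoded row has 17 values, which matches the record size in bytes (4 × 4 + 1), and the values look like each record's raw bytes read as signed 8-bit integers. The sketch below checks that reading for the first record using NumPy only. It assumes little-endian storage, which is why the fields are spelled `'<u4'` here, and the names `record_dtype`, `first_record` and `raw_bytes` are illustrative.

```python
import numpy as np

# Same layout as thedtype in the script above, with the byte order made explicit.
record_dtype = np.dtype([('arg0', '<u4'),
                         ('arg1', '<u4'),
                         ('arg2', '<u4'),
                         ('arg3', '<u4'),
                         ('arg4', 'S1')])

# First record from thedata.
first_record = np.array([(39133, 628009281, 103670, 4, b'a')], dtype=record_dtype)

# Reinterpret the record's underlying bytes as signed 8-bit integers.
raw_bytes = np.frombuffer(first_record.tobytes(), dtype=np.int8)

print(record_dtype.itemsize)  # 17
print(raw_bytes.tolist())
# [-35, -104, 0, 0, 65, -87, 110, 37, -10, -108, 1, 0, 4, 0, 0, 0, 97]
```

The printed list matches the first row of the decoded array, which suggests the data bytes survived the round trip and only the structured dtype was lost on decoding.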

lebedov commented 7 years ago

Try the latest revision on GitHub - if it fixes the problem, I'll post a new release.

subiol commented 7 years ago

Hi, this is what I get when I run the test using the latest git version:

$ python test_msgpack_numpy.py
[(39133, 628009281, 103670, 4, b'a')
 (39133, 628009283, 103670, 10, b'a')
 (39198, 628010832, 103662, 28, b'b')
 (43248, 628119055, 103247, 26, b'b')
 (43252, 628119143, 103246, 6, b'b')
 (43254, 628119186, 103246, 22, b'b')]
6
(6,)
[('arg0', '<u4'), ('arg1', '<u4'), ('arg2', '<u4'), ('arg3', '<u4'), ('arg4', 'S1')]
Traceback (most recent call last):
  File "test_msgpack_numpy.py", line 23, in <module>
    decoded_arr = msgpack.unpackb(x_enc, object_hook=m.decode)
  File "msgpack/_unpacker.pyx", line 139, in msgpack._unpacker.unpackb (msgpack/_unpacker.cpp:2068)
  File "/home/jo/anaconda3/envs/mntest/lib/python3.6/site-packages/msgpack_numpy.py", line 71, in decode
    dtype=np.dtype(descr)).reshape(obj[b'shape'])
TypeError: data type not understood

lebedov commented 7 years ago

Added another fix - try again.

subiol commented 7 years ago

The test works now. I'll keep working with this and will test similar cases further, but so far it looks fine.

subiol commented 7 years ago

I have tried a few more of the data types I use and found no problems; everything is processed correctly. For my part, this issue can be closed.
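
For anyone who wants to repeat this kind of check on their own dtypes, here is a minimal round-trip sketch. It is not taken from the msgpack-numpy test suite: `check_roundtrip` and the two stand-in codecs are hypothetical names used only for illustration. The stand-ins use NumPy alone, so the example runs without msgpack installed.

```python
import numpy as np

def check_roundtrip(arr, pack, unpack):
    """Return True if unpack(pack(arr)) has the same dtype, shape and bytes as arr."""
    decoded = unpack(pack(arr))
    return (isinstance(decoded, np.ndarray)
            and decoded.dtype == arr.dtype
            and decoded.shape == arr.shape
            and decoded.tobytes() == arr.tobytes())

sample_dtype = np.dtype([('arg0', '<u4'), ('arg1', '<u4'), ('arg2', '<u4'),
                         ('arg3', '<u4'), ('arg4', 'S1')])
sample = np.array([(39133, 628009281, 103670, 4, b'a'),
                   (39198, 628010832, 103662, 28, b'b')], dtype=sample_dtype)

# Stand-in codec that keeps the dtype description and shape next to the bytes.
def good_pack(a):
    return (a.dtype.descr, a.shape, a.tobytes())

def good_unpack(packed):
    descr, shape, data = packed
    return np.frombuffer(data, dtype=np.dtype(descr)).reshape(shape)

# Stand-in codec that drops the dtype, similar in effect to the symptom reported above.
def lossy_pack(a):
    return a.tobytes()

def lossy_unpack(data):
    return np.frombuffer(data, dtype=np.int8).reshape(-1, sample_dtype.itemsize)

print(check_roundtrip(sample, good_pack, good_unpack))    # True
print(check_roundtrip(sample, lossy_pack, lossy_unpack))  # False
```

To exercise the real library, pass the calls from the script above as the codec: `lambda a: msgpack.packb(a, default=m.encode)` for `pack` and `lambda b: msgpack.unpackb(b, object_hook=m.decode)` for `unpack`.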

lebedov commented 7 years ago

Package updated to 0.3.9.