Blosc / bloscpack

Command line interface to and serialization format for Blosc
BSD 3-Clause "New" or "Revised" License

Bugfix: structured arrays #22

Closed (esc closed 9 years ago)

esc commented 9 years ago

Closes #16

esc commented 9 years ago

The problem with the current implementation is that it isn't idempotent:

In [3]: bloscpack.numpy_io._fix_numpy_metadata([('a', 'S1'), ('b', 'f8')])
Out[3]: (('a', 'S1'), ('b', 'f8'))

This happens when the metadata is part of a CompressedSource but hasn't previously been serialized via JSON.
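To make the idempotency point concrete, here is a minimal sketch of a normalizer that gives the same result whether or not the description has been through JSON. The name `normalize_dtype_descr` is hypothetical and this is not the code in `bloscpack.numpy_io`; it only assumes that a structured dtype description is a sequence of `(name, format)` or `(name, format, shape)` fields with plain string names.

```python
import json


def normalize_dtype_descr(descr):
    """Hypothetical helper: coerce a dtype description to list-of-tuples form.

    Plain strings such as '<f8' pass through unchanged. For structured
    dtypes, each field becomes a tuple and the outer container a list,
    regardless of whether the input is the JSON form (list of lists) or
    the in-memory form (list or tuple of tuples).
    """
    if isinstance(descr, str):
        return descr
    return [_normalize_field(field) for field in descr]


def _normalize_field(field):
    # Assumption: a field is (name, format) or (name, format, shape).
    name, fmt, *shape = field
    fmt = normalize_dtype_descr(fmt)  # the format may be a nested description
    if shape:
        dims = shape[0]
        # JSON turns a shape tuple into a list; turn it back into a tuple.
        dims = tuple(dims) if isinstance(dims, (list, tuple)) else dims
        return (name, fmt, dims)
    return (name, fmt)


original = [('a', 'S1'), ('b', 'f8')]
from_json = json.loads(json.dumps(original))  # [['a', 'S1'], ['b', 'f8']]

once = normalize_dtype_descr(from_json)
twice = normalize_dtype_descr(once)
print(once == original, twice == once)
```

Applying the helper to already-normalized input is a no-op, so it should not matter whether the metadata came from a file or directly from memory.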

The test-suite yields the following error:

$ nosetests test/test_numpy_io.py 
.................................................................................................................................................................................................................................................................E.........
======================================================================
ERROR: Failure: ValueError (invalid shape in fixed-type tuple.)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/esc/anaconda/lib/python2.7/site-packages/nose/loader.py", line 251, in generate
    for test in g():
  File "/home/esc/gw/bloscpack/test/test_numpy_io.py", line 131, in test_numpy_dtypes_shapes_order
    for case in roundtrip_ndarray(a):
  File "/home/esc/gw/bloscpack/test/test_numpy_io.py", line 83, in roundtrip_ndarray
    yield roundtrip_numpy_memory(ndarray)
  File "/home/esc/gw/bloscpack/test/test_numpy_io.py", line 45, in roundtrip_numpy_memory
    b = unpack_ndarray(source)
  File "/home/esc/gw/bloscpack/bloscpack/numpy_io.py", line 220, in unpack_ndarray
    sink = PlainNumpySink(source.metadata)
  File "/home/esc/gw/bloscpack/bloscpack/numpy_io.py", line 116, in __init__
    dtype=numpy.dtype(_fix_numpy_metadata(metadata['dtype'])),
ValueError: invalid shape in fixed-type tuple.

----------------------------------------------------------------------
Ran 267 tests in 2.328s

FAILED (errors=1)

esc commented 9 years ago

Perhaps it would be better to attach this to the JSON Serializer and ensure that the data comes back out unchanged. This will probably also be an issue for Python 3, due to the str and unicode equivalence.
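A rough sketch of what attaching the fix to the serializer could look like, so that a dump/load round trip returns the dtype description unchanged. The class `DtypeAwareJSONSerializer` and the helper `_as_field_tuples` are hypothetical names, not bloscpack's actual serializer API; the only thing taken from the traceback above is that the metadata carries a `'dtype'` entry.

```python
import json


def _as_field_tuples(descr):
    # Simple dtypes are plain strings and need no repair.
    if isinstance(descr, str):
        return descr
    fields = []
    for name, fmt, *shape in descr:
        # An optional shape comes back from JSON as a list; restore the tuple.
        shape = tuple(tuple(s) if isinstance(s, list) else s for s in shape)
        fields.append((name, _as_field_tuples(fmt)) + shape)
    return fields


class DtypeAwareJSONSerializer:
    """Hypothetical serializer that undoes JSON's tuple-to-list conversion."""

    def dumps(self, metadata):
        return json.dumps(metadata)

    def loads(self, payload):
        metadata = json.loads(payload)
        if 'dtype' in metadata:
            metadata['dtype'] = _as_field_tuples(metadata['dtype'])
        return metadata


serializer = DtypeAwareJSONSerializer()
metadata = {'dtype': [('a', 'S1'), ('b', 'f8')]}
restored = serializer.loads(serializer.dumps(metadata))
print(restored == metadata)
```

With the repair living in `loads`, consumers of the metadata would not need to know whether it was ever serialized.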

esc commented 9 years ago

@dmbelov: I made a super hackish fix in 03ec966; let's wait and see what Travis-CI says.

esc commented 9 years ago

Having some weird segfaults when running nosetests locally on my machine.

anaconda [bloscpack:vh/bugfix/structured_arrays:★★★★★★] ~/gw/bloscpack esc@toolbox ₍★₎
zsh» nosetests test/test_numpy_io.py
..............................................................................................................................................................................................................................................................................................
----------------------------------------------------------------------
Ran 286 tests in 0.628s

OK
anaconda [bloscpack:vh/bugfix/structured_arrays:★★★★★★] ~/gw/bloscpack esc@toolbox ₍★₎
zsh» nosetests 
..................................................................................................................................................................................................................................................................................................................................................................................................................................................................[3]    15026 segmentation fault (core dumped)  nosetests
nosetests  5.14s user 0.64s system 156% cpu 3.702 total

esc commented 9 years ago

The segfault disappears when removing all *.pyc files and comes back after running the tests once. I'll have to investigate further.
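For repeating this experiment without relying on shell globbing, a small sketch that removes compiled files from a source tree. The function name `remove_compiled_files` is made up for illustration, and the temporary directory below merely stands in for a checkout.

```python
import tempfile
from pathlib import Path


def remove_compiled_files(root):
    """Delete every *.pyc file below root and return the removed paths."""
    removed = []
    # Materialize the listing first so files are not deleted mid-iteration.
    for path in sorted(Path(root).rglob('*.pyc')):
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp) / 'pkg'
    package.mkdir()
    (package / 'mod.py').write_text('')
    (package / 'mod.pyc').write_bytes(b'')
    removed = remove_compiled_files(tmp)
    remaining = sorted(p.name for p in Path(tmp).rglob('*') if p.is_file())

print([p.name for p in removed], remaining)
```

Setting the `PYTHONDONTWRITEBYTECODE` environment variable stops Python from writing bytecode files at all, which might help check whether stale compiled files are really the trigger.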

esc commented 9 years ago

Here, check this out:

anaconda [bloscpack:vh/bugfix/structured_arrays:★★★★★★] ~/gw/bloscpack esc@toolbox ₍★₎
zsh» nosetests
..................................................................................................................................................................................................................................................................................................................................................................................................................................................................[3]    16290 segmentation fault (core dumped)  nosetests
nosetests  5.15s user 0.61s system 158% cpu 3.637 total
anaconda [bloscpack:vh/bugfix/structured_arrays:★★★★★★] ~/gw/bloscpack esc@toolbox ₍★₎
139 zsh» rm -rf **/*.pyc
anaconda [bloscpack:vh/bugfix/structured_arrays:★★★★★★] ~/gw/bloscpack esc@toolbox ₍★₎
zsh» nosetests
..............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
----------------------------------------------------------------------
Ran 478 tests in 3.845s

OK
nosetests  5.70s user 0.53s system 156% cpu 3.983 total
anaconda [bloscpack:vh/bugfix/structured_arrays:★★★★★★] ~/gw/bloscpack esc@toolbox ₍★₎
zsh» nosetests
..................................................................................................................................................................................................................................................................................................................................................................................................................................................................[3]    16469 segmentation fault (core dumped)  nosetests
nosetests  5.13s user 0.64s system 158% cpu 3.632 total

esc commented 9 years ago

I improved the solution for deserializing numpy arrays but am still getting unexplained segfaults when compiled Python files are present...

esc commented 9 years ago

It segfaults on Travis too...

esc commented 9 years ago

I commented out some tests and am attempting to solve this. Need to rebase against maint-0.7.x to include some unrelated fixes from there.

esc commented 9 years ago

It's green again; time to figure out why the commented-out tests segfault this thing.

esc commented 9 years ago

Closing in favour of #24.