blaze / datashape

Language defining a data description protocol
BSD 2-Clause "Simplified" License
183 stars 65 forks source link

Need tests to verify whether ds == eval(str(ds)) for all datashapes #90

Closed talumbau closed 9 years ago

talumbau commented 10 years ago

A recent PR (https://github.com/ContinuumIO/datashape/pull/89) came about because one could make a datashape where eval(str(ds)) != ds. It would be helpful if we exhaustively tested this within datashape.

mrocklin commented 10 years ago

+1

brittainhard commented 10 years ago

Hows this for a test scheme?

Test each dtype individually, then test a combination of a datashape containing 3, 5, then a larger number like 10-15?

brittainhard commented 10 years ago

Also some ragged arrays thrown in.

brittainhard commented 10 years ago

Look what I found!

>       self.assertEqual(str(dshape('complex64')), 'complex64')
E       AssertionError: 'complex[float32]' != 'complex64'
mrocklin commented 10 years ago

Sure, I suggest a list of datashapes and a test that asserts eval(str(ds)) == ds for each one.

On Mon, Sep 15, 2014 at 3:22 PM, brittainhard notifications@github.com wrote:

Also some ragged arrays thrown in.

— Reply to this email directly or view it on GitHub https://github.com/ContinuumIO/datashape/issues/90#issuecomment-55644084 .

brittainhard commented 10 years ago

@mrocklin are we talking about the built in function eval()? iI couldnt get that to work. TJ suggested i model it on his test that came with his PR, which is str(dshape('int64')) and the like.

mrocklin commented 10 years ago

Hrm, lets try repr

In [1]: from datashape import *

In [2]: dshape('int64')
Out[2]: dshape("int64")

In [3]: str(dshape('int64'))
Out[3]: 'int64'

In [4]: repr(dshape('int64'))
Out[4]: 'dshape("int64")'

In [5]: eval(repr(dshape('int64')))
Out[5]: dshape("int64")

In [6]: eval(repr(dshape('int64'))) == dshape('int64')
Out[6]: True
cpcloud commented 10 years ago

Fwiw when you have a list of things that you need to iterate over and perform the same comparison operation on, pytest parametrized fixtures are very handy. You proved the list and the operation, and it does the looping for you. For Cartesian product style tests this is very useful. Of course it sill gives you nice feedback as usual.

brittainhard commented 10 years ago

Thanks @cpcloud i'll do that.

mrocklin commented 10 years ago

This is a fairly simple operation. I'm not sure I would invoke pytest magic here. Just a style opinion though.

cpcloud commented 10 years ago

Yep maybe not necessary at this point. Just planting seeds >:)

mrocklin commented 10 years ago

In regards to fixtures, I actually think that there is some non-trivial cost to their use.

My general thought is that we should stick to simpler and more widely understood technologies and idioms if possible. This is similar to my thoughts of overusing toolz or multipledispatch. Yes, they're very slick idioms, but they do add an extra requirement to understanding code. They should be used when the core solution would be ugly or less effective, but should probably be avoided if that can be made convenient.

cpcloud commented 10 years ago

@mrocklin i'm happy either way, sometimes i let my fondness for terse code get in the way.

cpcloud commented 9 years ago

i think truly exhaustive testing of this is not practically doable. However a subset of dshapes that tries to characterize the scope of possible dshapes is definitely doable. Putting up a PR in a few