Crunch-io / scrunch

Pythonic scripting library for cleaning data in Crunch
GNU Lesser General Public License v3.0

Why is scrunch.cubes.crtabs now returning Cube instead of CrunchCube object? #390

Closed: alextanski closed this issue 3 years ago

alextanski commented 3 years ago

Previously, scrunch.cubes.crtabs returned a cr.cube.crunch_cube.CrunchCube instance, but it now returns a cr.cube.cube.Cube instead.

This breaks some functionality for DP. It also seems that the current version of cr still supports the "old" (?) higher-level CrunchCube implementation, e.g. for quickly building and (re-)indexing a pandas.DataFrame.

Is there a reason why this has changed, and/or can we roll this back?

xbito commented 3 years ago

@alextanski This seems like a question for Crunch. This may be the commit in which it happened: https://github.com/Crunch-io/scrunch/commit/42c356b11cfdfeb47cc2da675fc08b8d3f7bb7e1 by @slobodan-ilic?

slobodan-ilic commented 3 years ago

We haven't changed anything in scrunch, but the cube itself changed. What functionality is broken? I'm sure it can be fixed by using the appropriate "new" functionality. @xbito @alextanski ☝️

slobodan-ilic commented 3 years ago

@alextanski We haven't touched the old cube in about two years now. If you really need the old functionality, it's probably best to pin the cube to a version < 2.0.0. But I'd suggest switching to the new functionality; it should be much easier to use. I'm here to help whatever you decide.
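To see which side of the suggested pin an environment is on, the installed version can be inspected from Python. This is only a minimal sketch: it assumes the package is installed under the distribution name `cr.cube` (the thread only shows the import path), and the helper names `installed_version` and `is_pre_2` are hypothetical. The version comparison looks at the major component only, so pre-release tags such as `2.0.0rc1` are treated as 2.x.

```python
from importlib import metadata


def installed_version(dist_name="cr.cube"):
    """Return the installed version string, or None if the package is absent.

    The default distribution name is an assumption, not taken from the thread.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_pre_2(version):
    """True if the major version component is below 2 (roughly: version < 2.0.0)."""
    major = int(version.split(".")[0])
    return major < 2


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("cr.cube does not appear to be installed")
    elif is_pre_2(version):
        print(f"cr.cube {version} is below the suggested 2.0.0 pin")
    else:
        print(f"cr.cube {version} is at or above 2.0.0")
```

If the old behaviour is needed, a requirement specifier along the lines of `cr.cube<2.0.0` would express the pin, again assuming that distribution name.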

alextanski commented 3 years ago

@slobodan-ilic Thanks for looking into this. We are exploring the "new" Cube functionality, but at some point the method inside scrunch still started to return a Cube rather than the "old" (?) CrunchCube. I have now taken another look at https://github.com/Crunch-io/crunch-cube/, and CrunchCube is indeed not present in this version.

Maybe this comes from a dependency installation hiccup on our end, then. We are going to double-check this.

So, for example, is there a Cube replacement for the CrunchCube.as_array() method? Can you point us to where the old functionality now lives?

alextanski commented 3 years ago

I think I have found a starting point to work from. We can close this ticket. I will check back in the crunch-io Slack channel with specific questions (if any).

slobodan-ilic commented 3 years ago

@alextanski Sure, just ask if anything is unclear, or if there are things you haven't used before. As for as_array(), there are alternatives. In particular, you should use the Cube.counts property, as described in this test (there are also many other tests with a plethora of examples).

Also, one additional thing to consider: we base the logic around 2D slices of the cube (a slice is the same as the cube when the cube is 2D, but not when it is 3D). These slices are called partitions and can be accessed like so.