mikepound / opencubes

A community improved version of the polycubes project!
MIT License

Rotation is a significant factor in runtime #9

Open ana-096 opened 1 year ago

ana-096 commented 1 year ago

From profiling the function generate_polycubes, for larger values of n the time spent in rot90 is a significant factor: 45 of the 108 seconds of runtime for n=9 (see the table at the bottom).

So far I've been thinking about a couple of potential solutions, but I haven't reached a point where either could be implemented:

Solution A - Writing a purpose-specific rot90 or all_rotations

Numpy appears to use index swapping to achieve 90-degree rotation of an array (almost certainly faster than a transformation matrix), but it carries significant overhead because it is written to work for n-dimensional arrays. In the case of polycubes, we know that the array we want to rotate is 3-D, and that we want all 24 rotations. Writing a purpose-specific version of this function could save on that overhead, but I think it would most likely need to be done in another language.

I haven't really been pursuing this, as I don't know any low-level or performant languages to try it in, and I doubt a pure-Python implementation would save any time.
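Even in Python there may be some room: a purpose-specific generator could enumerate the 24 proper rotations directly as axis permutations plus axis flips, skipping rot90's per-call dispatch. This is only a sketch, not the repo's implementation; rotations24 is a hypothetical name:

```python
import numpy as np
from itertools import permutations

def rotations24(a):
    """Yield the 24 proper rotations of a 3-D array as axis permutations
    plus axis flips (cheap numpy views, no matrix maths)."""
    for perm in permutations(range(3)):
        # parity of the axis permutation (number of inversions mod 2)
        parity = sum(perm[i] > perm[j]
                     for i in range(3) for j in range(i + 1, 3)) % 2
        for flips in range(8):  # bitmask: which axes to flip
            n_flips = bin(flips).count("1")
            # determinant = sign(perm) * (-1)**n_flips; keep only det +1,
            # i.e. proper rotations, excluding the 24 mirror images
            if (parity + n_flips) % 2:
                continue
            b = np.transpose(a, perm)
            for ax in range(3):
                if flips >> ax & 1:
                    b = np.flip(b, ax)
            yield b
```

For an asymmetric array this yields 24 distinct orientations, including the identity; whether it beats repeated rot90 calls in practice would need profiling.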

Solution B - Representing the polycube as three 2-D views

This solution appeals to me far more than Solution A, as it reduces either the number of comparisons between hashes, or the memory used to store hashes (if we store rotations). As we conveniently have a boolean array aligned with the axes, I think we can represent any polycube with three fast-to-calculate 2-D views of the polycube:

import numpy as np

def view_2d(polycube):
    # project the boolean array along each axis by summing occupied cells
    view1 = np.sum(polycube, axis=0)
    view2 = np.sum(polycube, axis=1)
    view3 = np.sum(polycube, axis=2)

    return view1, view2, view3

For the simple L-shaped (cropped) polycube with a length of 4 and one cube off to the side, the views would be represented this way:

[4, 1], [1, 1, 1, 2], [1, 0, 1, 0, 1, 0, 1, 1]

We can avoid worrying about the order in which these appear by using collections.Counter, or more generally by treating them as an unordered collection. We could then compare the views to those of a rotated version of the polycube by ensuring that every entry matches exactly once, hence finding that both polycubes are identical...
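The unordered comparison could be sketched with collections.Counter; view_fingerprint and views_match are hypothetical names, and as the rest of the thread shows, this naive form is not invariant under all rotations:

```python
import numpy as np
from collections import Counter

def view_fingerprint(polycube):
    """Multiset of the three projection views; each view is reduced to a
    hashable (shape, bytes) pair so Counter can hold it."""
    views = (polycube.sum(axis=ax).astype(np.uint8) for ax in range(3))
    return Counter((v.shape, v.tobytes()) for v in views)

def views_match(p1, p2):
    # equal multisets mean every view entry matches exactly once
    return view_fingerprint(p1) == view_fingerprint(p2)
```

Two polycubes in the same orientation compare equal this way; two genuinely different polycubes (with different projections) do not.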

Unfortunately, though this works for comparing the two 1-D views of a polysquare with some additional checks, I haven't found a way to ensure these views match for polycubes after any of the 24 arbitrary rotations:

polycube1_views         -> [4, 1], [1, 1, 1, 2], [1, 0, 1, 0, 1, 0, 1, 1]
polycube1_rotated_views -> [1, 4], [1, 0, 1, 0, 1, 1, 1], [1, 1, 1, 2]

We can see from this that some rotations will give us outputs that are very hard to compare to the original polycube for equality.

Perhaps a solution to this is to rebuild the arbitrary polycube in a specific manner so that it always gives the same output? I don't have a solution for this yet.
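One way to "rebuild in a specific manner" is a canonical form: map every orientation of a polycube to the same representative by taking the minimum over all orientations. A minimal sketch (canonical_id is a hypothetical name; the orientation generator, e.g. the repo's all_rotations, is passed in):

```python
import numpy as np

def canonical_id(orientations):
    """Lexicographically smallest (shape, bytes) key over an iterable of
    orientations of one polycube; every orientation maps to the same id."""
    return min((o.shape, np.ascontiguousarray(o).tobytes())
               for o in orientations)
```

Because min is order-independent, feeding in the orientations in any order yields the same identifier; the cost is still enumerating the rotations, which is the very overhead Solution A targets.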

Profiling

cumtime is the value we're worried about for rot90 here: looking at the source code for this function, it calls a number of other functions to achieve the 90-degree rotation, with flip being the main point of concern.

With optimisations from the PRs on the original repo (bertie2, georgedorn, and mine), the cProfile.run output looks like this for n=9:

         67515198 function calls (64901683 primitive calls) in 108.627 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   669114    1.084    0.000    8.019    0.000 <__array_function__ internals>:177(any)     
     8151    0.013    0.000    0.055    0.000 <__array_function__ internals>:177(around)  
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:177(copyto)  
  1693830    2.401    0.000   21.180    0.000 <__array_function__ internals>:177(flip)    
  1578102    2.321    0.000    4.637    0.000 <__array_function__ internals>:177(packbits)
     8151    0.018    0.000    0.794    0.000 <__array_function__ internals>:177(pad)     
  1637369    2.486    0.000   45.442    0.000 <__array_function__ internals>:177(rot90)   
     8151    0.014    0.000    0.091    0.000 <__array_function__ internals>:177(round_)  
  1338228    1.863    0.000    6.435    0.000 <__array_function__ internals>:177(swapaxes)
   903376    1.308    0.000    6.317    0.000 <__array_function__ internals>:177(transpose)
        1    0.055    0.055  108.627  108.627 <string>:1(<module>)
    32604    0.023    0.000    0.023    0.000 arraypad.py:109(<genexpr>)
    32604    0.021    0.000    0.021    0.000 arraypad.py:120(<genexpr>)
    24453    0.125    0.000    0.155    0.000 arraypad.py:129(_set_pad_area)
    48906    0.030    0.000    0.030    0.000 arraypad.py:33(_slice_at_axis)
    16302    0.086    0.000    0.220    0.000 arraypad.py:454(_as_pairs)
     8151    0.002    0.000    0.002    0.000 arraypad.py:521(_pad_dispatcher)
     8151    0.169    0.000    0.746    0.000 arraypad.py:529(pad)
    24453    0.033    0.000    0.033    0.000 arraypad.py:58(_view_roi)
     8151    0.105    0.000    0.157    0.000 arraypad.py:86(_pad_simple)
  1578102    8.903    0.000   17.019    0.000 cubes.py:152(rle)
   223038    0.341    0.000    2.796    0.000 cubes.py:174(cube_exists_rle)
  1693830    4.488    0.000   40.548    0.000 cubes.py:24(single_axis_rotation)
   223038    7.844    0.000   22.298    0.000 cubes.py:44(crop_cube)
   231189    1.430    0.000   24.981    0.000 cubes.py:64(expand_cube)
  1411525    7.328    0.000   57.259    0.000 cubes.py:9(all_rotations)
      8/1    8.456    1.057  108.572  108.572 cubes.py:96(generate_polycubes)
   669114    0.179    0.000    0.179    0.000 fromnumeric.py:2328(_any_dispatcher)
   669114    1.046    0.000    5.503    0.000 fromnumeric.py:2333(any)
    16302    0.004    0.000    0.004    0.000 fromnumeric.py:3241(_around_dispatcher)
     8151    0.010    0.000    0.031    0.000 fromnumeric.py:3245(around)
     8151    0.011    0.000    0.067    0.000 fromnumeric.py:3754(round_)
  2249755    1.892    0.000    4.635    0.000 fromnumeric.py:51(_wrapfunc)
  1338228    0.290    0.000    0.290    0.000 fromnumeric.py:546(_swapaxes_dispatcher)
  1338228    1.331    0.000    3.073    0.000 fromnumeric.py:550(swapaxes)
   903376    0.196    0.000    0.196    0.000 fromnumeric.py:597(_transpose_dispatcher)
   903376    1.029    0.000    3.903    0.000 fromnumeric.py:601(transpose)
   669114    1.598    0.000    4.458    0.000 fromnumeric.py:69(_wrapreduction)
   669114    0.590    0.000    0.590    0.000 fromnumeric.py:70(<dictcomp>)
  1637369    0.357    0.000    0.357    0.000 function_base.py:154(_rot90_dispatcher)
  1637369   11.148    0.000   40.823    0.000 function_base.py:158(rot90)
  1693830    0.342    0.000    0.342    0.000 function_base.py:248(_flip_dispatcher)
  1693830    8.105    0.000   16.712    0.000 function_base.py:252(flip)
  3387660    0.842    0.000    0.842    0.000 index_tricks.py:765(__getitem__)
        1    0.000    0.000    0.000    0.000 multiarray.py:1079(copyto)
  1578102    0.333    0.000    0.333    0.000 multiarray.py:1175(packbits)
  1693830    5.360    0.000    7.474    0.000 numeric.py:1348(normalize_axis_tuple)
  1693830    1.112    0.000    1.469    0.000 numeric.py:1398(<listcomp>)
        1    0.000    0.000    0.000    0.000 numeric.py:150(ones)
  1693830    0.227    0.000    0.227    0.000 {built-in method _operator.index}
     8151    0.001    0.000    0.001    0.000 {built-in method builtins.callable}
        1    0.000    0.000  108.627  108.627 {built-in method builtins.exec}
  2249755    0.394    0.000    0.394    0.000 {built-in method builtins.getattr}
  1693830    0.290    0.000    0.290    0.000 {built-in method builtins.hasattr}
  5025116    0.608    0.000    0.608    0.000 {built-in method builtins.len}
       94    0.009    0.000    0.009    0.000 {built-in method builtins.print}
       87    0.000    0.000    0.000    0.000 {built-in method builtins.round}
   903376    1.746    0.000    1.746    0.000 {built-in method numpy.arange}
   247491    0.432    0.000    0.432    0.000 {built-in method numpy.array}
  1637369    0.242    0.000    0.242    0.000 {built-in method numpy.asanyarray}
    16302    0.008    0.000    0.008    0.000 {built-in method numpy.asarray}
7844473/5230965    8.905    0.000   56.396    0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
  1693830    0.357    0.000    0.357    0.000 {built-in method numpy.core._multiarray_umath.normalize_axis_index}      
     8152    0.008    0.000    0.008    0.000 {built-in method numpy.empty}
  1355064    0.495    0.000    0.495    0.000 {method 'add' of 'set' objects}
    56461    0.012    0.000    0.012    0.000 {method 'append' of 'list' objects}
     8151    0.021    0.000    0.021    0.000 {method 'astype' of 'numpy.generic' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  1578102    3.479    0.000    3.479    0.000 {method 'flatten' of 'numpy.ndarray' objects}
     8151    0.002    0.000    0.002    0.000 {method 'get' of 'dict' objects}
   669114    0.090    0.000    0.090    0.000 {method 'items' of 'dict' objects}
    16302    0.037    0.000    0.037    0.000 {method 'nonzero' of 'numpy.ndarray' objects}
     8151    0.008    0.000    0.008    0.000 {method 'ravel' of 'numpy.generic' objects}
     8151    0.002    0.000    0.002    0.000 {method 'ravel' of 'numpy.ndarray' objects}
   669114    2.180    0.000    2.180    0.000 {method 'reduce' of 'numpy.ufunc' objects}
     8151    0.009    0.000    0.009    0.000 {method 'round' of 'numpy.ndarray' objects}
  1338228    0.499    0.000    0.499    0.000 {method 'swapaxes' of 'numpy.ndarray' objects}
   903376    1.842    0.000    1.842    0.000 {method 'transpose' of 'numpy.ndarray' objects}
bertie2 commented 1 year ago

I think you're onto something with the views; the hard part will be proving that the views are unique. I'm going to run a test with the n=12 data to check.

As for an efficient way to compare the views, I think we just need to come up with a quick-to-compute canonical direction and order for the vectors. So in your example:

polycube1_views         -> [4, 1], [1, 1, 1, 2], [1, 0, 1, 0, 1, 0, 1, 1]
polycube1_rotated_views -> [1, 4], [1, 0, 1, 0, 1, 1, 1], [1, 1, 1, 2]

we'd want something which ranks [4, 1] higher than [1, 4], for example,

and something that ranks [1, 0, 1, 0, 1, 0, 1, 1] higher than [4, 1].

The problem I foresee is that after doing all these checks to get the right order, is it any faster than rot90? More testing to be done.
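The ranking idea above might be sketched like this (canonical_views is a hypothetical name: orient each view vector against its reverse, then rank the oriented vectors), though as the later comments show, this naive ordering turns out not to be sufficient:

```python
def canonical_views(views):
    """Orient each view so it compares highest against its own reverse,
    then rank the oriented views themselves from highest to lowest."""
    oriented = [max(tuple(v), tuple(reversed(v))) for v in views]
    return sorted(oriented, reverse=True)
```

This makes [1, 4] and [4, 1] compare as the same view, and makes the order in which the three views arrive irrelevant.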

bertie2 commented 1 year ago

Retracted for wrongness. ~~Still trying to wrap my head around this, but it seems you lose some data and the views are non-unique. These 2 cubes ended up with the same view with the following code:~~ ~~Given they are mirrors of each other, it's probably something to do with that. You can see the code I used to check for this here:~~ https://github.com/bertie2/opencubes/blob/view_test/test.ipynb Let me know if I've made any obvious mistakes; maybe my hash function is no longer applicable?

bertie2 commented 1 year ago

Ah wait, I'm dumb: I was packing to bits and changing 2s into 1s.

bertie2 commented 1 year ago

OK, so you cannot in fact do a naive re-arrangement of the views based on their properties. The problem you hit is that certain rotations and flips swap two axes. The simplest example I could find was: [images: Cube A and Cube B]

The rotation essentially swaps two of the views because it is the same shape along different axes.

The interesting part is... that's a mirror! And we have identified it as the same. The current code doesn't identify mirrors as the same, so is this actually an improvement in behaviour, or a regression?

bertie2 commented 1 year ago

Some of these are really hard to tell apart from their mirrors just by looking at them. Any idea on how to make proper code to check for mirrors?

bertie2 commented 1 year ago

I can't seem to get this to work; it does in fact collide on more than just mirrors, at least with my naïve approach of sorting the vectors (picking the largest out of each view and its reverse, and then ordering the views by size). I have pushed my code at https://github.com/bertie2/opencubes/tree/view_test. This is not to say the idea can't work, just that I couldn't make it work.

Playit3110 commented 1 year ago

I had opened an issue in cubes with the idea of using radii to make the representation rotation-independent; maybe we could look into it. Or a multi-hash solution: not for all 24-or-so rotations, but only for some specific ones, normalising everything to a uniform structure (e.g. preferring certain axes). Then we'd need fewer hashes and would still count all shapes.

Edit: link to issue
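The radii idea above could be sketched as a rotation-invariant (but not collision-free) signature; radii_signature is a hypothetical name, assuming "radii" means squared distances of occupied cells from their centroid:

```python
import numpy as np

def radii_signature(polycube):
    """Sorted squared distances of occupied cells from their centroid;
    invariant under any rotation or mirror, but not collision-free."""
    pts = np.argwhere(polycube)          # integer coordinates of set cells
    centered = pts - pts.mean(axis=0)    # shift centroid to the origin
    # round to absorb floating-point noise between equivalent orientations
    return tuple(np.round(np.sort((centered ** 2).sum(axis=1)), 6))
```

Since distances are preserved by rotations and reflections alike, this lumps mirror pairs together, so it could only serve as a coarse pre-filter before an exact comparison.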

VladimirFokow commented 1 year ago

@bertie2 ,

any idea on how to make proper code to check for mirrors?

# polycube_1 and polycube_2 are both cropped `np.int8` ndarrays,
# and get_canonical_packing is the canonical-hash function from this repo

id_1 = get_canonical_packing(polycube_1, set())
id_2 = get_canonical_packing(polycube_2, set())
# canonical id of the mirror image (flipped along the first axis)
id_2_mirror = get_canonical_packing(np.flipud(polycube_2), set())

are_mirrors = id_1 != id_2 and id_1 == id_2_mirror

the current code doesn't identify mirrors as the same, so is this actually an improvement in behavior ? or a regression ?

Whether mirrors should be counted as separate or not could be yet another flag of the main script. Here you can see both counts: https://en.wikipedia.org/wiki/Polycube

To consider mirrors as the same, we can e.g. use the np.flipud call inside the get_canonical_packing function.
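That folding could be sketched like this; mirror_invariant_id is a hypothetical name, and the canonicalising function (e.g. the repo's get_canonical_packing) is passed in so the sketch stays self-contained, assuming its ids are comparable:

```python
import numpy as np

def mirror_invariant_id(polycube, canonicalize):
    """Fold a polycube and its mirror image onto one identifier by
    taking the smaller of their two canonical ids."""
    id_orig = canonicalize(polycube, set())
    id_mirror = canonicalize(np.flipud(polycube), set())
    return min(id_orig, id_mirror)
```

By symmetry of min, a polycube and its mirror image always receive the same identifier, whatever canonicalising function is supplied.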

ana-096 commented 1 year ago

I cant seem to get this to work, it does in fact collide more than just mirrors

The collision you have in test.ipynb is actually from a mirror operation! Consider mirroring the polycube along the X/Y plane, where the furthest cube (along our Z axis, the depth of the image) is our starting point. This is the same polycube, just rotated.

I think the fundamental issue we will likely run into is that this allows "illegal" rotations, i.e. mirroring, which are really 4-D rotations, counted by p(n) as described by Kevin Gong. The same happens in 2D when representing the two skew tetrominoes S and Z this way (a 3-D rotation achieves the mirroring). This could be useful for calculating p(n), which tends towards 2 * r(n). It could still be useful for r(n) too, as we could then search specifically for mirrored cubes and "unmirror" them. I'll keep looking and see whether we run into any collisions that aren't mirrors.
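The 2-D analogue can be checked directly: the S and Z tetrominoes are related by a flip (a rotation through the third dimension) but by no in-plane rotation.

```python
import numpy as np

# S and Z tetrominoes as boolean grids
S = np.array([[0, 1, 1],
              [1, 1, 0]], dtype=np.uint8)
Z = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=np.uint8)

# no in-plane (2-D) rotation turns S into Z...
rotations = [np.rot90(S, k) for k in range(4)]
no_planar_match = not any(r.shape == Z.shape and np.array_equal(r, Z)
                          for r in rotations)

# ...but a mirror flip (a "3-D rotation" through the plane) does
flip_match = np.array_equal(np.fliplr(S), Z)
```

Here no_planar_match and flip_match both come out True, which is exactly the chirality the comment describes.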

ana-096 commented 1 year ago

Just an afterthought: it makes heaps of sense that p(n) and r(n) appear to be linked by roughly a factor of 2. As n grows, we get more polycubes with chirality relative to the number of achiral polycubes, and each chiral polycube pairs with exactly one other polycube of the opposite chirality. This is much easier to think about in 2D, where for larger values of n we find more polysquares that can be mirrored (3-D rotated) in such a way that no 2-D rotation achieves the same shape.

VladimirFokow commented 1 year ago

@ana-096 ,

Solution B - Representing the polycube as three 2-D views

for example, if the shape is (4,4,4), then there are 64 variables (each 0 or 1), but only 3*16 = 48 equations constraining them, so for some systems of these equations more than one solution is possible. This likelihood grows with the shape.

Is this not a problem? I'll try to generate a collision and update this.

VladimirFokow commented 1 year ago

@ana-096, here is an example of collisions using "Solution B": https://gist.github.com/VladimirFokow/c314ab45dbe1c97d763b4e81a0a3728e The views aren't unique.

(I think that C++ and "Solution A" are the way to beat the record...)

But thinking about more efficient representations of polycubes: do you agree that the total storage requirements cannot theoretically be smaller than described here? https://github.com/mikepound/cubes/issues/14#issuecomment-1636304106


On the other hand, if the point of this new representation is not to never have collisions, but to avoid wasting so much time on rotations, I agree that it could make sense to implement it: resolving a few collisions could be faster than rotating literally all of the polycubes.

To go further with this, we'd need to develop code for how to:

ana-096 commented 1 year ago

@VladimirFokow your example is for a polycube of n=33; did you find any smaller collisions? I did some quick tests for arbitrary shapes 2 < x,y,z < 10 (yes, this is taking a long time) and haven't found any smaller collisions yet. Kevin Gong only got to n=16, so if there are no collisions, or we account for them, up to n=17, this could help us get further.

VladimirFokow commented 1 year ago

@ana-096 results for n=9 and n=10: https://gist.github.com/VladimirFokow/aebc3dc937d0585018690a8aceba3ebf

ana-096 commented 1 year ago

@VladimirFokow are you checking for mirrors? The examples you provided for n=9 are, I think, mirrors, but it's hard to tell when they're represented as arrays.

for example, if the shape is (4,4,4), then there are 64 variables (each 0 or 1), but only 3*16 = 48 equations constraining them, so for some systems of these equations more than one solution is possible. This likelihood grows with the shape.

Is this not a problem? I'll try to generate a collision and update this.

For a (4,4,4) boolean array there are 2 ^ 64 values. Represented as 2-D views, each view would be a (4,4) integer array where the value of any given cell is 0 <= x <= 4, hence 5 ^ 16 values per view, which is approximately 2 ^ 37.

In general, for two dimensions x, y, the number of possible boolean arrays is 2 ^ (xy), while the 1-D views can take (x+1) ^ y and (y+1) ^ x values. The number of possible values in the x-by-y boolean array grows faster than the views; however, I haven't yet figured out how many of those values can be attributed to mirrors and rotationally symmetric arrays.
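The counting in the two paragraphs above can be checked directly (this is just the arithmetic already stated, not a new result):

```python
# a (4,4,4) boolean array has 2**64 occupancy states, while a single
# (4,4) view with entries 0..4 has 5**16 states, roughly 2**37
assert 2 ** 37 < 5 ** 16 < 2 ** 38

# in 2-D, each 1-D view grows more slowly than the boolean grid itself
for x, y in [(5, 7), (10, 10)]:
    assert (x + 1) ** y < 2 ** (x * y)
    assert (y + 1) ** x < 2 ** (x * y)
```

So any single view necessarily discards information; the open question in the thread is how much of the shortfall is explained by mirrors and symmetric shapes.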

VladimirFokow commented 1 year ago

@ana-096

are you checking for mirrors?

Of course; I wouldn't claim that they are mirrors without checking it 😁 I use this idea to check for mirrors (you can also use it to check for equality); it's pretty simple: https://github.com/mikepound/opencubes/issues/9#issuecomment-1634794247

I haven't yet figured out how many of the values can be attributed to mirrors and rotationally symmetric arrays

I don't see a way to logically derive a formula for this number. The way to proceed is to implement the method and see in practice whether it improves the computation time or not. (Here are the next steps)

For n=9 and n=10 there really aren't many collisions (only 4 and 43). So maybe this method would actually be helpful up to n=17, unless the number of collisions grows astronomically fast before then; I don't know.
(As a side task of this new implementation, maybe all the collisions for higher n could also be detected, counted and saved.)