CodingTrain / Suggestion-Box

A repo to track ideas for topics

Coding Challenge: Rubik's Cube Solver using Neural Network #956

Open vimkat opened 6 years ago

vimkat commented 6 years ago

As suggested by @vladig98 in https://github.com/CodingTrain/website/issues/573:

Coding challenge: A fully functional Rubik's cube solver. Something similar to this https://www.google.com/logos/2014/rubiks/iframe/index.html. Also use a neural network to create a solution for the cube and maybe then apply it. That'll be cool.

JohnyDL commented 6 years ago

I think this would be interesting, but I'm struggling to think of an approach to it. Most solvers find solutions by identifying situations and applying common solving algorithms: stage one, assess which cross is easiest to form (<7 moves, bonus for partial F2L at the same time); stage two, apply 1 of the 77(ish) advanced-case F2L algorithms (up to 44 moves, usually a lot less); stage three, PLL/OLL, 1 of 57 cases (up to 11 moves). But trying to teach a neural network to produce a solution might be hard.

Applying a move-by-move approach seems like the best option to me, but what training data do you give it? There's trivial data: one-, two-, three-move solves. But how do you tell the network that a given move is better or worse than another when you're 10 moves away, the cube looks random, and you can't analyse the solution space to full depth? You could of course just brute-force teach it deeper and deeper trivial solves until it's seen every single state, but what you'd end up with isn't really a neural network that can solve the cube but a lookup table for the 43,252,003,274,489,856,000 configurations.

Also, how do you input the data? Humans basically take in 54 inputs of 6 colours, but is that really the best way for a computer? I'd interpret that as 324 binary neurons for input. But there are only 27 cubies: 7 static pieces in the core, 12 edges and 8 corners, and each of the edges and corners can be in 24 states (edges: 12 locations x 2 orientations; corners: 8 locations x 3 orientations). Maybe representing the inputs in some way that encodes that would be better, rather than requiring hidden layers to work it out. Or not, I can't tell. Do you encode for symmetries or not? I'd say it'd be better if the inputs did basically compensate for symmetries, but again, how?
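For scale, the two input schemes differ concretely: 54 stickers x 6 colours gives 324 binary inputs, while one-hot coding the 20 movable cubies over their 24 (position, orientation) states gives 20 x 24 = 480. A rough sketch of just the encodings (the solved-cube layouts are placeholders, not a full cube model):

```python
N_STICKERS, N_COLOURS = 54, 6   # 54 facelets, 6 colours
N_CUBIES, N_STATES = 20, 24     # 12 edges + 8 corners; 24 (position, orientation) states each

def one_hot(index, width):
    v = [0.0] * width
    v[index] = 1.0
    return v

def encode_stickers(colours):
    """One-hot encode 54 sticker colours -> 54*6 = 324 binary inputs."""
    assert len(colours) == N_STICKERS
    return [bit for c in colours for bit in one_hot(c, N_COLOURS)]

def encode_cubies(states):
    """One-hot encode the 20 movable cubies -> 20*24 = 480 binary inputs."""
    assert len(states) == N_CUBIES
    return [bit for s in states for bit in one_hot(s, N_STATES)]

# A solved cube as placeholder data (real code would read these off a cube model).
solved_stickers = [face for face in range(6) for _ in range(9)]  # 9 stickers per colour
solved_cubies = [0] * N_CUBIES                                   # every cubie "home"

print(len(encode_stickers(solved_stickers)))  # 324
print(len(encode_cubies(solved_cubies)))      # 480
```

Whether the fatter 480-input cubie encoding actually helps, or whether the network learns the same structure from stickers, would be an empirical question.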

However the input is encoded, I'm pretty sure it'd require multiple hidden layers, so a TensorFlow project might be a good fit, if there are solutions to the inherent problems. I'm gonna keep thinking about it anyway; just thought others might be able to answer the questions too.
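The overall shape being discussed (encoded state in, one of 18 face turns out, hidden layers in between) can be sketched without any framework. This is only an untrained toy in pure Python; a real project would use TensorFlow or similar, and the weights and input here are placeholders:

```python
import math, random

random.seed(0)

def layer(n_in, n_out):
    # Small random weights; a real project would use TensorFlow instead.
    return [[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def forward(x, weights):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def relu(v):
    return [max(0.0, z) for z in v]

def softmax(v):
    m = max(v)
    exps = [math.exp(z - m) for z in v]
    total = sum(exps)
    return [e / total for e in exps]

# 324 one-hot sticker inputs -> one hidden layer -> 18 outputs,
# one per possible face turn (U, U2, U', ..., D2, D').
W1, W2 = layer(324, 64), layer(64, 18)

x = [1.0 if i % 6 == 0 else 0.0 for i in range(324)]  # placeholder one-hot stickers
move_probs = softmax(forward(relu(forward(x, W1)), W2))
print(len(move_probs))  # 18 move probabilities summing to 1
```

The open questions above (training signal, depth, symmetry) are all about what would make `W1`/`W2` converge, which this sketch deliberately says nothing about.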

mlwyatt commented 6 years ago

I've been thinking about this for the past few days because I recently watched the NN videos. What about having a single output for "God's algorithm" (20 or fewer moves)? That way you don't have to look for patterns and apply OLL etc.; you find the shortest solve instead of the most intuitive one. You would have to supply a "big enough" training data set with known "God's algorithm" solutions to ensure it can solve scrambles outside of the training set.

JohnyDL commented 6 years ago

The God's Algorithm approach would hopefully be what one-move-at-a-time should give, since the find-patterns approach only works if it's coordinated and can return algorithms, not single steps. But God's algorithm isn't really an algorithm; I referred to it above as a database of all states linked in a way that leads each state to the solve. In effect it's a tree containing the 43,252,003,274,489,856,000 configurations as nodes, indexed in a way that makes searching it for a given configuration easy.

As for a training set, you can probably teach all the 4- or 5-God-move positions by brute force, but you could never cover enough of the deeper levels to provide enough training data for the NN to stand a chance at solving an average scramble. There are 18 possible moves at each level (6 faces turning 90, 180 or 270 degrees), and some of the positions would overlap, but still, that's a training set of up to ~50 million positions and you haven't made a dent in the total solution space.
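The brute-force shallow training set can be generated by inverting random scrambles: the reversed, inverted scramble is a known (though not necessarily optimal) solution. A minimal sketch with no actual cube model, so a position is represented only by the scramble that produced it; a real version would apply the moves to a cube state:

```python
import random

random.seed(1)
FACES = "UDLRFB"
SUFFIXES = ["", "2", "'"]
INVERSE = {"": "'", "2": "2", "'": ""}

def random_scramble(depth):
    """Random move sequence, never turning the same face twice in a row."""
    seq, last = [], None
    for _ in range(depth):
        face = random.choice([f for f in FACES if f != last])
        seq.append(face + random.choice(SUFFIXES))
        last = face
    return seq

def solution_for(scramble):
    """The reversed sequence of inverse moves undoes the scramble exactly."""
    return [m[0] + INVERSE[m[1:]] for m in reversed(scramble)]

# Training pairs for depths 1-5: (scrambled position, known solution).
pairs = [(s, solution_for(s)) for s in (random_scramble(d) for d in range(1, 6))]
for scramble, solution in pairs[:2]:
    print(scramble, "->", solution)
```

As noted above, this only labels positions with *a* solution of the same depth as the scramble; proving it's the God's-algorithm-optimal one is the part that stops scaling.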

Interestingly, there might be a way to solve it as a two-player game though. Treat it as a game of chess from a random position: odd moves try to get to the classic solution and even moves aim for the anti-solution (edges flipped). To begin with, the two players might get lucky (eventually). If we know some of the near misses for the solution and anti-solution, those could be enough to define a win (maybe the AI could iterate on its own for 20 steps, and if it hits the solution, back-propagate that as a win, with the weight stronger for fewer steps). For millions of moves each AI will likely never reach the actual solution, but it might reach within 4 moves of it, so in the beginning stages of training that constitutes a win. Hopefully, after a long while of battling, each AI might be able to get to its solution quickly if the opposition wasn't in the way (and the two AIs are actually the same AI, just with some inverted inputs), so it'd be like AlphaZero playing itself at chess: getting good by trying to counter its own bad strategies.
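The win condition described (end within a few moves of your target inside 20 steps, shorter solves reinforced harder) could be shaped roughly like this. Purely speculative: `distance_to_target` is a hypothetical oracle, and not knowing that distance cheaply is exactly the hard part discussed earlier in the thread.

```python
def selfplay_reward(steps_taken, distance_to_target, near_miss=4, max_steps=20):
    """Hypothetical reward for the two-player self-play idea above.

    distance_to_target: moves left to the player's goal (solution or
    anti-solution) -- a stand-in oracle, since computing it is the hard part.
    Ending within `near_miss` moves counts as a win; fewer steps score higher,
    so back-propagating this reinforces shorter solves more strongly.
    """
    if steps_taken > max_steps or distance_to_target > near_miss:
        return 0.0  # no win this episode
    return (max_steps - steps_taken + 1) / max_steps

# Quicker wins earn larger rewards:
print(selfplay_reward(3, 0))   # 0.9
print(selfplay_reward(18, 2))  # 0.15
print(selfplay_reward(10, 6))  # 0.0 (not close enough to count as a win)
```

Early in training `near_miss` could start large and shrink as the players improve, mirroring the "within 4 moves counts as a win for now" idea.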

mlwyatt commented 6 years ago

@JohnyDL that makes sense. What are the 4 or 5 God move positions? I've never heard of that before.

JohnyDL commented 6 years ago

Positions which take 4 or 5 moves to solve

God move 1 would be any face with a 90, 180 or 270 degree twist: U, U2, U', L, L2, L', F, F2, F', R, R2, R', B, B2, B', D, D2, D'

God move 2 would be any God move 1 position with another move: 1/18 will be the original position (UU', U2U2, U'U...), 2/18 will be other God move 1 positions (UU, UU2, U2U, U2U'...), you can end up with duplicates (UD = DU), and the remaining ~5/6ths are provably 2 moves from the solution with no better alternative.

God move 3 would be a God move 2 with another move: again 1/18th becomes a God move 1 position and 2/18ths become alternate God move 2s, again with position duplication, and ~5/6ths are God move 3 positions.

The idea being, they are starting positions where the God algorithm would solve in 1, 2 or 3 moves. The same thing extends to 4 or 5 or 6 or n, and for the relatively low ~50 million positions that are 5 moves or less from the solution, they're more or less computable in a reasonable time.

I worked out a bunch by hand, in the following I'm going to assume that mirror images of the cube and whole cube rotations are ignored.

God move 1, 2 positions: U, U2 (U' is ignored because it's a mirror of U; R, L, F, B and D moves are all rotations of U)

God move 2, 9 positions: UR, UR2, UR', UD, UD2, UD', U2R, U2R2, U2D2 (U2D and U2D' are ignored because they're mirrors of UD2, and U2R' is a reflection of U2R)

God move 3, 71 positions: URU, URU2, URU', URF, URF2, URF', URL, URL2, URL', URB, URB2, URB', URD, URD2, URD', UR2U, UR2U2, UR2U', UR2F, UR2F2, UR2F', UR2B, UR2B2, UR2B', UR2D, UR2D2, UR2D', UR'U, UR'U2, UR'U', UR'F, UR'F2, UR'F', UR'B, UR'B2, UR'B', UR'D, UR'D2, UR'D', UDR, UDR2, UD2R, UD2R2, UD2R', U2RU, U2RU2, U2RU', U2RF, U2RF2, U2RF', U2RL, U2RL2, U2RL', U2RB, U2RB2, U2RB', U2RD, U2RD2, U2RD', U2R2U, U2R2U2, U2R2U', U2R2F, U2R2F2, U2R2F', U2R2B, U2R2B2, U2R2B', U2R2D, U2R2D2, U2R2D'

God move 4, a lot more positions than I'm willing to work out by actually making the moves on my cube, but it starts: URUR, URUR2, URUR', URUF, URUF2, URUF', URUL, URUL2, URUL', URUB, URUB2, URUB', URUD, URUD2, URUD'... repeat the process 70 times for the other 70 God move 3s, except omitting mirrors and rotations, and checking each sequence doesn't result in a position we've already assigned notation for.
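For a sanity check on the raw counts (before the mirror/rotation reduction done by hand above), one can enumerate "canonical" move sequences: no face twice in a row, and one fixed order for commuting opposite-face pairs like UD vs DU. A quick sketch, assuming standard face-turn notation; at depths 1-3 these sequences happen to correspond one-to-one with distinct positions (18, 243, 3240), though from depth 4 onward a few different sequences start landing on the same position, so a real count needs an actual cube model:

```python
from itertools import product

FACES = "UDLRFB"
OPPOSITE = {"U": "D", "D": "U", "L": "R", "R": "L", "F": "B", "B": "F"}
AXIS_FIRST = set("ULF")  # one fixed face per axis, to order commuting moves

def count_canonical(depth):
    """Count sequences of exactly `depth` face turns, skipping same-face
    repeats and fixing one order for commuting opposite-face pairs."""
    total = 0
    for faces in product(FACES, repeat=depth):
        ok = True
        for a, b in zip(faces, faces[1:]):
            if b == a or (b == OPPOSITE[a] and a not in AXIS_FIRST):
                ok = False
                break
        if ok:
            total += 3 ** depth  # each face turn is 90, 180 or 270 degrees
    return total

print([count_canonical(d) for d in (1, 2, 3)])  # [18, 243, 3240]
```

The 2 / 9 / 71 figures above are what's left of 18 / 243 / 3240 after also quotienting out mirrors and whole-cube rotations.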

pranavgade20 commented 6 years ago

I think that the best approach would be using 3-5 Recurrent Neural Networks.

  1. To solve the top layer, with T shape on the side.
  2. Solve the middle layer.
  3. Solve the + on the bottom.
  4. Arrange the edges.
  5. Solve the edges.

Yeah, it will be a difficult task, but the same neural network with different training datasets should be able to do the job.
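In outline, "same network, different training sets" might look like this. Purely a sketch: `make_network` is a stand-in for whatever model constructor would actually be used, and the stage names are just labels for the five steps listed above.

```python
import random

random.seed(2)

# The five stages suggested above, each getting its own independently
# trained copy of one shared architecture.
STAGES = ["top_layer", "middle_layer", "bottom_cross", "arrange_edges", "solve_edges"]

def make_network(n_inputs=324, n_outputs=18):
    """Stand-in constructor: identical untrained architecture per stage."""
    return [[random.uniform(-0.1, 0.1) for _ in range(n_inputs)]
            for _ in range(n_outputs)]

# A controller would detect the current stage and hand the cube state to the
# matching network; each network only ever trains on scrambles of its stage.
solvers = {stage: make_network() for stage in STAGES}
print(len(solvers))  # one solver per stage
```

The controller's "detect the current stage" step is doing real work here, which is the crux of the objections that follow.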

JohnyDL commented 6 years ago

pranavgade, what would be the point in using neural networks in the standard identify-situation -> pick-algorithm method? There's a list of algorithms as long as my arm that're basically plug and play; here's one OLL, PLL set: http://www.kupendeza.com/wp-content/themes/medicenter/images/solve_pdf/how_to_solve_rubiks_cube_advanced.pdf

If you're breaking up the network from one does-it-all network into 3-5 "does this thing we already have an answer for" subsections, and you're not expecting magic but just the same thing that a classical algorithm can do 1000x faster, then why bother with an NN at all?

pranavgade20 commented 6 years ago

I think that it would be more modular if we use multiple neural networks instead of just one.

Also, there are good algorithms out there which can solve the cube in a reasonable amount of time, like this one: http://www.cube20.org/src/. But the question asks SPECIFICALLY for a neural network, not an algorithm. Indeed, the point of ML is to make a given task simpler, with less code. The algorithms require a lot of branching and conditional statements, which is what we are trying to avoid.

tl;dr: The question asks for a neural net; multiple neural networks would be easier to code, and faster.

JohnyDL commented 6 years ago

What do you mean more modular? And how exactly is it easier to code fuzzy positions? And how is it faster having to hit any arbitrary substate(s) of the cube on the way to the solution?

I'm assuming here you'll need to encode the standard fuzzy positions to hit (cross, F2L, OLL, PLL) en route. So do you give the cube to network A, which gives you moves to the cross, then hand it to network B for F2L, etc.? Then the NN just analyses states and picks from the algorithms on offer to get to the next desired state, which isn't really an NN that "creates a solution", it's one that decides between existing solutions. Or are you feeding in the cube step by step and expecting progress towards the fuzzy midpoints? In which case, why do you need the fuzzy midpoints at all? Coding fuzzy positions as targets isn't any easier than coding the end goal, which you need anyway, and the training data for fuzzy positions is equivalent to that for the final solution; all you're doing is recolouring the positions I enumerated 10 days ago with some fuzziness. And by adding in fuzziness you're actually making the guarantee of progress less likely.

Having midpoints can lead to conflicts and loops: there will be times, during say F2L, where a move made to put in some pieces undoes previous F2L steps. Humans and hardcoded solutions work around this with algorithms, but we've already said we don't want that, we want something step-wise, so midpoints are in effect high-scoring failures. Heading directly to the solution never has that problem.

There's also the problem of identifying which state you're initially in. If the several module NNs are handed a cube only a few moves from the solution, would they ruin it by heading for a cross? Or would they be able to identify where they are? I can give you a cube 6 moves from a solution that looks scrambled (FBLRFB looks fairly scrambled at a glance), but I still contend that heading to the cross is backwards progress in that state.

Also, the one 'complicated' NN is expandable to 4x4x4 or larger cubes, or a 3x3x3x3 hypercube, without too much difficulty other than massaging the inputs (more cubies), outputs (more layers that can be twisted), network (just add nodes to layers and/or layers in total) and training data (which I manually figured out, but which could be automated to some degree by iteration). Going modular means it'd need new fuzzy states defined and a lot more massaging for each expansion, including manually created training data from existing algorithms, rather than automation.

As for your tl;dr: the question asks for "a neural net", not "multiple neural networks". Coding the first net is the hardest part; to code N nets you still need to code the first, so it's not "easier to code", and by adding nets you create more complications than you solve. Is it faster to code? No: more complications means more to think about and plan for, and more time. Faster to run/train? No, because you've got to train multiple networks, which means multiple different training sets and extended training time, since each needs separate training.

vladig98 commented 5 years ago

Here's a video on the topic https://www.youtube.com/watch?v=f9smvQ5fc7Q

pranavgade20 commented 5 years ago

@vladig98 It does solve the cube, but it is all algorithmic, i.e. NO AI/neural network involved, which is what this issue asks for. So I'd say that video is not related.

pranavgade20 commented 5 years ago

@JohnyDL I think you are correct for the most part, but I'd say that having multiple neural nets would significantly reduce training time. (Yeah, that is what I wanted to say when I said faster: faster to train and code, not faster performance.)

vladig98 commented 5 years ago

OK, I saw the live stream and I wanted to share this: https://github.com/vladig98/RubixCube I created the Rubik's Cube in JavaScript a while ago and was implementing a lot of algorithms for fast solving (F2L method with full OLL and PLL), but I was not able to create the cube itself and I wasn't able to create the AI to solve it. Basically, my idea is something like: use an AI to pick one of the algorithms and apply it. If it is the correct algorithm, keep going; otherwise the AI fails and starts over. I've used if statements to determine which algorithm should be used, but the AI could pick a random one until it gets the correct algorithm. That'll be better than doing random moves and eventually running through all 43 quintillion different possibilities. It'll be nuts. My code has its issues, and some of the algorithms are wrong and I haven't fixed them, but you get the basic idea: instead of using those if statements, the AI can pick a random algorithm and, once it's done, keep going until it solves the cube. However, this will not work on a bigger cube such as 4x4x4, because there are other algorithms that would need to be implemented.
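The pick-a-random-algorithm-instead-of-if-statements idea can be sketched with a toy stand-in. This is not real cube mechanics: the "cube" here is just a stage counter, and `ALGORITHMS` is a placeholder for the real F2L/OLL/PLL case list, each assumed to fix exactly one stage:

```python
import random

random.seed(3)

# Placeholder algorithm table: each named algorithm fixes exactly one stage.
ALGORITHMS = {f"alg_{i}": i for i in range(8)}

def random_pick_solve(n_stages=8, max_tries=10_000):
    """Sample algorithms at random; keep any that advances the current stage.
    A real version would apply the algorithm's moves to an actual cube state
    and check whether the resulting position is further along."""
    stage, tries = 0, 0
    while stage < n_stages and tries < max_tries:
        name = random.choice(list(ALGORITHMS))
        tries += 1
        if ALGORITHMS[name] == stage:  # "correct algorithm" -> progress
            stage += 1
    return stage, tries

solved_stage, attempts = random_pick_solve()
print(solved_stage, attempts)
```

With 8 stages and 8 algorithms this needs on the order of 64 tries, which illustrates the point: random selection over a small algorithm table is vastly cheaper than random moves over 43 quintillion positions, at the cost of needing the table (and a progress check) up front.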

pranavgade20 commented 4 years ago

Just came across something: https://doi.org/10.24433/CO.4958495.v1 Although it uses reinforcement learning rather than a plain neural network, I think this is the closest this thread has got to the original issue.