The observation space declared in the environment (13 dimensions) did not match the state actually returned (9 dimensions). The declared size is now 9, representing the 3D positions of the ball, the maze, and the goal.
The termination condition of the episode was always 'False'. It now becomes 'True' when the ball either falls off the maze or reaches the goal point.
Removed an unnecessary update of the goal that ran when getting the new positions of the scene components at each step. The goal doesn't change between steps; it only changes after each episode.
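
For reviewers, the three changes can be illustrated with a minimal, dependency-free sketch. This is not the environment's actual code: the class name `MazeEnvSketch`, the thresholds `FALL_HEIGHT` and `GOAL_TOLERANCE`, and the simplified `step` signature (which takes the new ball position instead of running physics) are all assumptions made for illustration.

```python
import random

OBS_DIM = 9  # 3D position of the ball, the maze, and the goal (3 + 3 + 3)


class MazeEnvSketch:
    """Hypothetical stand-in for the environment, for illustration only."""

    FALL_HEIGHT = -1.0     # assumed: a ball z-coordinate below this counts as fallen
    GOAL_TOLERANCE = 0.05  # assumed: distance at which the goal counts as reached

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self.observation_shape = (OBS_DIM,)  # declared size matches the returned state
        self.ball = [0.0, 0.0, 0.0]
        self.maze = [0.0, 0.0, 0.0]
        self.goal = [0.0, 0.0, 0.0]

    def reset(self):
        # The goal is resampled only here, once per episode.
        self.goal = [self._rng.uniform(-1.0, 1.0), self._rng.uniform(-1.0, 1.0), 0.0]
        self.ball = [0.0, 0.0, 0.0]
        self.maze = [0.0, 0.0, 0.0]
        return self._observation()

    def step(self, new_ball_position):
        # Stand-in for the physics update: the caller supplies the ball position.
        # Note that the goal is not touched here.
        self.ball = list(new_ball_position)
        return self._observation(), self._terminated()

    def _observation(self):
        obs = self.ball + self.maze + self.goal
        assert len(obs) == self.observation_shape[0]
        return obs

    def _terminated(self):
        fell = self.ball[2] < self.FALL_HEIGHT
        dist = sum((b - g) ** 2 for b, g in zip(self.ball, self.goal)) ** 0.5
        return fell or dist <= self.GOAL_TOLERANCE
```

In this sketch an episode ends as soon as `step` reports termination, and the goal stays fixed until the next call to `reset`.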