Closed by impulsecorp 6 years ago
I will answer only the first part and leave the error part to Guillem. We have 2 flavours of the same algorithm:
1) FMC: this one is a step-by-step process. It is run after every small step the system takes: you decide on your next step by growing a fractal that explores the next X steps, do whatever you decided, and then repeat (deciding again from scratch). It is not very efficient, because the whole fractal you created to decide "go left" is deleted once you have gone left, but in exchange the agent adapts to unexpected changes in non-deterministic environments, so it is very robust to noise.
Video example: https://youtu.be/yx695HfQoMY
2) Swarm Wave: if you know the environment is deterministic, you can grow your fractal as in the first step of an FMC but, instead of stopping X steps into the future and deciding only the next step, you keep growing the fractal towards the future until the game is solved or all your paths lead to death. If one of the paths created solves the game, you walk back from that winning node and build a complete sequence of decisions taking you from the start to a winning end.
Video example: https://youtu.be/0t7jI9WdTWI
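The step-by-step loop described in 1) can be illustrated with a minimal sketch. This is not the FractalAI code: the fractal is replaced by plain random rollouts as a stand-in planner, and `CorridorEnv`, `plan_one_step` and `run_step_by_step` are hypothetical names invented for this example. It only shows the control flow: plan X steps ahead, take one real step, throw the plan away, and plan again.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk along a line from 0 to GOAL."""
    GOAL = 5

    def __init__(self, position=0):
        self.position = position

    def clone(self):
        return CorridorEnv(self.position)

    def step(self, action):
        # action is +1 (right) or -1 (left); reward is 1 only when the goal is reached
        self.position += action
        done = self.position == self.GOAL
        return (1.0 if done else 0.0), done


def plan_one_step(env, horizon, n_rollouts, rng):
    """Stand-in planner (NOT a fractal): score each first action with random rollouts."""
    totals = {+1: 0.0, -1: 0.0}
    for first_action in totals:
        for _ in range(n_rollouts):
            sim = env.clone()                      # never touch the real environment
            reward, done = sim.step(first_action)
            score, steps = reward, 1
            while not done and steps < horizon:    # look at most `horizon` steps ahead
                reward, done = sim.step(rng.choice((-1, 1)))
                score += reward
                steps += 1
            totals[first_action] += score
    return max(totals, key=totals.get)


def run_step_by_step(env, horizon, n_rollouts, max_steps, seed=0):
    """Plan, act once, discard the plan, repeat."""
    rng = random.Random(seed)
    actions, solved = [], False
    for _ in range(max_steps):
        action = plan_one_step(env, horizon, n_rollouts, rng)  # decide from scratch
        _, solved = env.step(action)                           # take the real step
        actions.append(action)
        if solved:
            break
    return actions, solved


if __name__ == "__main__":
    print(run_step_by_step(CorridorEnv(), horizon=8, n_rollouts=50, max_steps=40))
```

Because every decision is recomputed from the current real state, a loop of this shape keeps working if the real environment does something the previous plan did not expect, at the cost of discarding all planning work after each step.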
So, for a real robot, use FMC; but for generating large numbers of high-scoring records, or for beating records using fewer samples than others to impress people, use Swarm Wave.
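The "grow until solved, then walk back" idea from 2) can be sketched in the same spirit. Again, this is not the Swarm Wave implementation: a breadth-first search stands in for the fractal growth, and the toy environment (`GOAL`, `PITS`, `ACTIONS`, `step`) and the function names are assumptions made up for this example. What it does show is the part specific to deterministic environments: every node remembers its parent, so a winning node can be turned into a full action sequence.

```python
from collections import deque

GOAL = 6
PITS = frozenset({2, 5})   # hypothetical cells that kill the agent
ACTIONS = (1, 2)           # hypothetical moves: step one cell or jump two


def step(position, action, pits):
    """Deterministic toy dynamics: returns (new_position, dead)."""
    new_position = position + action
    dead = new_position in pits or new_position > GOAL
    return new_position, dead


def walk_back(nodes, index):
    """Follow parent pointers from a winning node back to the start."""
    actions = []
    while nodes[index][1] is not None:
        _, parent, action = nodes[index]
        actions.append(action)
        index = parent
    actions.reverse()
    return actions


def grow_until_solved(start=0, pits=PITS):
    """Expand the search tree until a node wins or every path is dead."""
    nodes = [(start, None, None)]   # (position, parent_index, action_taken)
    frontier = deque([0])
    seen = {start}
    while frontier:
        index = frontier.popleft()
        position = nodes[index][0]
        for action in ACTIONS:
            new_position, dead = step(position, action, pits)
            if dead or new_position in seen:
                continue
            seen.add(new_position)
            nodes.append((new_position, index, action))
            child = len(nodes) - 1
            if new_position == GOAL:
                return walk_back(nodes, child)
            frontier.append(child)
    return None   # all paths led to death


if __name__ == "__main__":
    print(grow_until_solved())   # [1, 2, 1, 2]
```

The returned list can be replayed from the start state to reproduce the winning game, which only works because the toy environment is deterministic.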
The error I am getting seems related to rendering (which is a problem on an AWS Linux server), so I will figure it out. If I change this to False: s.run_swarm(print_swarm=False)
then it shows that it ran with no errors:
CPU times: user 13.6 s, sys: 280 ms, total: 13.9 s
Wall time: 1min 11s
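If the failure really does come from rendering on a server without a display, one way to avoid editing the flag by hand is to derive it from the environment. The helper below is a hypothetical sketch, not part of FractalAI; it only assumes the usual X11 convention that a `DISPLAY` variable is set when a display is available.

```python
import os


def display_available(environ=None):
    """Return True if an X display is advertised through the DISPLAY variable."""
    environ = os.environ if environ is None else environ
    return bool(environ.get("DISPLAY"))


if __name__ == "__main__":
    # `print_swarm` is the flag quoted above; `s` would be the swarm object
    # from the notebook, so the call is left as a comment here:
    # s.run_swarm(print_swarm=display_available())
    print("display available:", display_available())
```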
This shouldn't be happening. If you run into more trouble, use my personal fork, which I have not merged yet. I will make the pull request ASAP, but meanwhile you can use this: https://github.com/Guillemdb/FractalAI/tree/learning
AttributeError Traceback (most recent call last)
When I try your new FMC_example.ipynb, it saves the video files in /videos, but I want the SARS data from the training runs, for example to use in DQN. That is why I was using the swarm version.
My advice is that you forget about the demos in the FractalAI repository. They are totally outdated. I will try to make them work again today, but it may take some time because I am currently at the EuroPython conference.
Meanwhile, if you are using my fork, use this notebook as an example: https://github.com/Guillemdb/hacking-rl/blob/master/demo.ipynb I tried it yesterday in my talk, and both algorithms worked well.
To generate data, go to cell number [21] and change mode="load" to mode="save", and make sure the target folder exists. That should be enough to create some data, but you can also use the included data generators found in the last cells.
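For readers who only want to see what saved SARS data might look like, here is a minimal, library-independent sketch. It does not use the hacking-rl notebook's API: `toy_step`, `collect_transitions`, `save_transitions`, `load_transitions` and the pickle file layout are all assumptions made for illustration. It records (state, action, reward, next_state, done) tuples from a toy environment and creates the target folder before writing.

```python
import os
import pickle
import random


def toy_step(state, action):
    """Hypothetical environment: the state counts up and the episode ends at 3."""
    next_state = state + action
    done = next_state >= 3
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def collect_transitions(n_steps, seed=0):
    """Record (state, action, reward, next_state, done) tuples from random play."""
    rng = random.Random(seed)
    transitions = []
    state = 0
    for _ in range(n_steps):
        action = rng.choice((0, 1))
        next_state, reward, done = toy_step(state, action)
        transitions.append((state, action, reward, next_state, done))
        state = 0 if done else next_state   # reset when the episode ends
    return transitions


def save_transitions(transitions, folder, name="transitions.pkl"):
    """Write the transitions to folder/name, creating the folder if needed."""
    os.makedirs(folder, exist_ok=True)      # make sure the target folder exists
    path = os.path.join(folder, name)
    with open(path, "wb") as handle:
        pickle.dump(transitions, handle)
    return path


def load_transitions(path):
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    data = collect_transitions(20)
    print(save_transitions(data, "sars_data"), len(data))
```

A DQN replay buffer could then be filled from the loaded list; the tuple layout here is only one common convention.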
Thanks, the page at https://github.com/Guillemdb/hacking-rl/blob/master/demo.ipynb is exactly what I needed. I don't need you to fix the old demos.
At first I had a problem with the new demo at https://github.com/Guillemdb/hacking-rl/blob/master/demo.ipynb, but disabling NetworkX (http://networkx.github.io) fixed it.
I am confused by your Swarm example at https://github.com/FragileTheory/FractalAI/blob/master/swarm_wave_example.ipynb as compared to your FMC example at https://github.com/FragileTheory/FractalAI/blob/master/FMC_example.ipynb
In your Abstract section, you write about Swarm: "This implementation is far more efficient than FMC, effectively "solving" a substantial number of Atari games." Does that mean I should use your Swarm example for playing Atari games instead of your FMC example? The great scores you list in your table at https://github.com/FragileTheory/FractalAI show "FMC" as the method.
Also, your FMC example works on my server, but your Swarm example at https://github.com/FragileTheory/FractalAI/blob/master/swarm_wave_example.ipynb does not work; I get this error:
TypeError Traceback (most recent call last)