impulsecorp commented 6 years ago

I am confused by your Swarm example at https://github.com/FragileTheory/FractalAI/blob/master/swarm_wave_example.ipynb as compared to your FMC example at https://github.com/FragileTheory/FractalAI/blob/master/FMC_example.ipynb

In your Abstract section, you write about Swarm: "This implementation is far more efficient than FMC, effectively "solving" a substantial number of Atari games." Does that mean I should use your Swarm example for playing Atari games instead of your FMC example? The great scores you list in your table at https://github.com/FragileTheory/FractalAI show "FMC" as the method.

Also, your FMC example works on my server, but your Swarm example at https://github.com/FragileTheory/FractalAI/blob/master/swarm_wave_example.ipynb does not. I get this error:

    TypeError                                 Traceback (most recent call last)
    in ()

    ~/fractalai/fractalai/swarm.py in run_swarm(self, state, obs, print_swarm)
        500             self._i_simulation += 1
        501             if self._i_simulation % self.render_every == 0 and print_swarm:
    --> 502                 print(self)
        503                 clear_output(True)
        504         except KeyboardInterrupt:

    ~/fractalai/fractalai/swarm_wave.py in __str__(self)
         43         text = super(SwarmWave, self).__str__()
         44         if self.save_data:
    ---> 45             efi = (len(self.tree.data.nodes) / self._n_samples_done) * 100
         46             sam_step = self._n_samples_done / len(self.tree.data.nodes)
         47             samples = len(self.tree.data.nodes)

    TypeError: object of type 'method' has no len()
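A note on the traceback above: the message says that `self.tree.data.nodes` is a method, so `len()` is being applied to a bound method rather than to a collection. That pattern shows up when a graph object exposes `nodes` as a method in some library releases and as a sized view in others (older NetworkX releases behaved the first way). The sketch below uses two stand-in classes, `MethodGraph` and `ViewGraph`, which are hypothetical and not part of FractalAI, to show a node count that tolerates either form. It illustrates the failure mode only and is not the fix applied in the repository.

```python
class MethodGraph:
    """Hypothetical stand-in: `nodes` is a plain method."""

    def __init__(self, nodes):
        self._nodes = list(nodes)

    def nodes(self):
        return list(self._nodes)


class ViewGraph:
    """Hypothetical stand-in: `nodes` is a property returning a sized collection."""

    def __init__(self, nodes):
        self._nodes = list(nodes)

    @property
    def nodes(self):
        return tuple(self._nodes)


def count_nodes(graph):
    """Count nodes whether `graph.nodes` is a method or a sized attribute."""
    nodes = graph.nodes
    if callable(nodes):
        # Calling it yields the actual collection of nodes.
        nodes = nodes()
    return len(nodes)
```

With this helper, `count_nodes(MethodGraph("abc"))` and `count_nodes(ViewGraph("abc"))` both return a count, whereas `len(MethodGraph("abc").nodes)` raises the same `TypeError` as in the report.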

sergio-hcsoft commented 6 years ago

I will answer only the first part and leave the error part to Guillem. We have two flavours of the same algorithm:

1) FMC: this one is a step-by-step process. It is run after every small step the system takes: you decide your next step by growing a fractal that explores the next X steps, do whatever you decided, and then repeat (deciding again from scratch). It is not very efficient, because the whole fractal you created to decide "go left" is deleted after going left, but in exchange the agent adapts to unexpected changes in non-deterministic environments, so it is very robust to noise.

Video example: https://youtu.be/yx695HfQoMY

2) Swarm Wave: if you know the environment is deterministic, you can grow your fractal as in the first step of FMC but, instead of stopping X steps into the future and deciding only the next step, you keep growing the fractal towards the future until the game is solved or all your paths lead to death. If one of the paths solved the game, you walk back from that winning node and build the complete sequence of decisions that takes you from the start to a winning end.

Video example: https://youtu.be/0t7jI9WdTWI

So, for a real robot, use FMC. For generating large numbers of high-scoring games, or for beating records with fewer samples than other methods, use Swarm Wave.
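The walk-back step described for Swarm Wave can be sketched as follows. This is a minimal illustration, assuming the explored tree is stored as two dictionaries: `parents` maps each node to its parent, and `actions` maps each node to the action that reached it. These names, the toy tree, and the rewards are hypothetical and do not reflect FractalAI's actual data structures.

```python
def walk_back(parents, actions, leaf_id):
    """Return the actions leading from the root to `leaf_id`, in play order."""
    path = []
    node = leaf_id
    # Climb from the chosen leaf to the root, collecting actions on the way.
    while parents[node] is not None:
        path.append(actions[node])
        node = parents[node]
    # Actions were collected leaf-to-root, so reverse them.
    path.reverse()
    return path


# Toy tree (hypothetical): node 0 is the root, nodes 2 and 4 are leaves.
parents = {0: None, 1: 0, 2: 0, 3: 1, 4: 3}
actions = {1: "left", 2: "right", 3: "left", 4: "fire"}
rewards = {2: 1.0, 4: 5.0}

# Pick the highest-scoring leaf and rebuild the decisions that reach it.
best_leaf = max(rewards, key=rewards.get)
plan = walk_back(parents, actions, best_leaf)
```

In a deterministic environment, replaying `plan` from the initial state reproduces the winning game, which is why this variant does not need to replan after every step.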

impulsecorp commented 6 years ago

The error I am getting seems related to rendering (which is a problem on an AWS Linux server), so I will figure it out. If I disable printing with s.run_swarm(print_swarm=False), it runs with no errors:

    CPU times: user 13.6 s, sys: 280 ms, total: 13.9 s
    Wall time: 1min 11s

Guillemdb commented 6 years ago

This shouldn't be happening. If you run into more trouble, use my personal fork, which I have not merged yet. I will make the pull request ASAP, but meanwhile you can use this: https://github.com/Guillemdb/FractalAI/tree/learning

impulsecorp commented 6 years ago

I downloaded the new version at https://github.com/Guillemdb/FractalAI/tree/learning and it fixed that problem. Swarm_Wave_example.ipynb works now, but there is a different error:

    # Get the observations belonging to the highest score game
    obs = s.tree.get_branch(leaf_id= s.walkers_id[s.rewards.argmax()])
    s.rewards.tolist() # Uncomment to see the scores for each walker

    AttributeError                            Traceback (most recent call last)
    in ()
          1 # Get the observations belonging to the highest score game
    ----> 2 obs = s.tree.get_branch(leaf_id= s.walkers_id[s.rewards.argmax()])
          3 s.rewards.tolist() # Uncomment to see the scores for each walker

    AttributeError: 'NoneType' object has no attribute 'get_branch'

Also, unrelated to that, I don't see the training runs saved anywhere for me to use them. How do I access them?

impulsecorp commented 6 years ago

When I try your new FMC_example.ipynb, it saves the video files in /videos, but I want the SARS data from the training runs, for example to use in DQN. That is why I was using the Swarm version.

Guillemdb commented 6 years ago

My advice is that you forget about the demos in the FractalAI repository. They are totally outdated. I will try to make them work again today, but it may take some time because I am currently at the EuroPython conference.

Meanwhile, if you are using my fork, use this notebook as an example: https://github.com/Guillemdb/hacking-rl/blob/master/demo.ipynb I tried it yesterday in my talk, and both algorithms worked well.

To generate data, change mode="load" to mode="save" in cell number [21], and make sure the target folder exists. That should be enough to create some data, but you can also use the included data generators found in the last cells.
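The "make sure the target folder exists" step can be handled in code before saving. The sketch below is a generic illustration using only the standard library: `save_transitions`, `load_transitions`, the file name, and the tuple layout (state, action, reward, next state) are assumptions for the example, not the notebook's actual API or storage format.

```python
import pickle
from pathlib import Path


def save_transitions(transitions, folder, name="run_000.pkl"):
    """Pickle a list of (state, action, reward, next_state) tuples into `folder`."""
    target = Path(folder)
    # Create the target folder (and any missing parents) if it does not exist.
    target.mkdir(parents=True, exist_ok=True)
    path = target / name
    with path.open("wb") as fh:
        pickle.dump(list(transitions), fh)
    return path


def load_transitions(path):
    """Load the list of transitions written by `save_transitions`."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)
```

Data loaded this way could then be fed to an off-policy learner such as DQN as a replay buffer, provided the stored tuples match what that learner expects.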

impulsecorp commented 6 years ago

Thanks, the page at https://github.com/Guillemdb/hacking-rl/blob/master/demo.ipynb is exactly what I needed. I don't need you to fix the old demos.

impulsecorp commented 6 years ago

At first I had a problem with the new demo at https://github.com/Guillemdb/hacking-rl/blob/master/demo.ipynb, but disabling NetworkX (http://networkx.github.io) fixed it.

FragileTech / FractalAI

Swarm Wave Example Error #90
