kstaats / karoo_gp

A Genetic Programming platform for Python with TensorFlow for wicked-fast CPU and GPU support.
Other
157 stars 61 forks source link

Switch tree navigation from breadth-first to depth-first #89

Closed granawkins closed 1 year ago

granawkins commented 2 years ago

Karoo Classic uses a breadth-first-search ordering (BFS) when navigating trees. The new Node API makes heavy use of recursion, which is by nature depth-first (DFS). We've implemented workarounds for breadth-first to guarantee continuity of test results through all the recent changes.

We've decided to switch to DFS by default, and remove support for BFS because:

NOTE: We've discussed that Python handles recursion a bit differently, so there are some implicit limits. However, we don't expect to reach those limits (works fine at tree_depth_base=15), and if we do, they can be increased manually.

What's the impact on Trees?

'Grow' trees are generated stochastically. Starting from the root node, at every 'child' position, flip a coin to decide whether to add a function (with additional children) or a terminal (no children). So they result in irregularly-shaped trees.

We'd expect that a population of grow trees generated with BFS and DFS will have different 'shapes'. Here's a diagram showing that DFS trees have a higher probability of being deep and wide (i.e. have lots of children). dfs_bfs_diagram

I did an experiment where I generated 1000-each trees using DFS and BFS, and compare the frequency of depths and widths. Checks out. dfs_bfs_chart

So, expect a PR with this soon.

ezio-melotti commented 2 years ago

Just out of curiosity, have you tried to reach Python's recursion limit by increasing the tree depth? Also have you noticed any performance differences between the two algos on trees with a similar number of nodes?

kstaats commented 2 years ago

Very, very interesting. I am concerned for the lack of breadth in the new DFS method. Let's discuss how this might affect solutions.

Thank you!

On 8/23/22 23:04, Grant wrote:

Karoo Classic uses a breadth-first-search ordering (BFS) when navigating trees. The new Node API makes heavy use of recursion, which is by nature depth-first (DFS). We've implemented workarounds for breadth-first to guarantee continuity of test results through all the recent changes.

We've decided to switch to DFS by default, and remove support for BFS because:

  • We'd prefer to choose just one to support, so we don't have to write/update everything twice.
  • BFS is theoretically faster and light, requires less code, and we use a lot of recursion already.

NOTE: We've discussed that Python handles recursion a bit differently, so there are some implicit limits. However, we don't expect to reach those limits (works fine at tree_depth_base=15), and if we do, they can be increased manually.

What's the impact on Trees?

'Grow' trees are generated stochastically. Starting from the root node, at every 'child' position, flip a coin to decide whether to add a function (with additional children) or a terminal (no children). So they result in irregularly-shaped trees.

We'd expect that a population of grow trees generated with BFS and DFS will have different 'shapes'. Here's a diagram showing that DFS trees have a higher probability of being deep and wide (i.e. have lots of children). dfs_bfs_diagram

I did an experiment where I generated 1000-each trees using DFS and BFS, and compare the frequency of depths and widths. Checks out. dfs_bfs_chart

So, expect a PR with this soon.

granawkins commented 2 years ago

I found some mistakes in the DFS algorithm - the results above weren't actually correct, and didn't match what I was saying. These results do: dfs_bfs_chart_2

Let's discuss nevertheless.

@ezio-melotti When I was testing recursion, I got bored waiting around depth 17/18, so I never reached the limit. I will try some performance tests now.