Closed robdancer closed 3 months ago
Thanks! Yeah could this be split into two PRs? Usually it's fine to have both in one, but these features cover quite different areas so are probably best split.
I wonder if we could depend on https://github.com/JuliaGraphs/Graphs.jl? If we set up all of the right methods in a DynamicExpressionsGraphsExt
, we should get the plotting "for free" – https://juliagraphs.org/Graphs.jl/stable/first_steps/plotting/ and not need our own custom plotting function.
Although will need https://github.com/JuliaGraphs/Graphs.jl/issues/395
Thanks! Yeah could this be split into two PRs? Usually it's fine to have both in one, but these features cover quite different areas so are probably best split.
Agreed - two PRs is best. Let me close this pull request and I'll set up two new ones.
Two main features - they could be split and PR'd separately if that's preferable
Graph evaluator caching
The
GraphNode
struct now has an additional property, theAbstractArray
cache, not initialised during object creation. This enables the result of each node evaluation to be stored and retrieved without the need for a dictionary lookup.A
GraphNode
can now also have theconstant
property set to true during evaluation if it is an operator node with constant operand nodes. In such a case theval
property will also be set to the result of the single operation required. This behaviour won't occur in sufficiently optimised expressions, as a subexpression consisting of operators with constant operand nodes could be replaced with a single constant node. This doesn't meaningfully mutate the expression during the evaluation, as theconstant
property is otherwise meaningless for nodes with degree >0.In terms of the evaluator itself, there are two implemented versions: one using
Base.map
, and one usingLoopVectorization.vmapnt
which is used when the LoopVectorization package is loaded. Some rough tests I did found that, on a graph with no shared subexpressions, the graph evaluator takes 1.4x as long as the corresponding (with or without LoopVectorization) tree evaluator. I might do some more work on reducing that, one method would be to quickly identify whether shared subexpressions exist in the expression and choose a corresponding evaluator. The relative performance of the graph evaluator operating on an expression with shared subexpressions is entirely dependent on the number and size of the shared subexpressions but has a theoretically unbounded performance improvement.Expression visualiser
Expressions using many repeated shared subexpressions can produce long string representations, making it difficult to understand the graph structure of the expression. An example of the visualiser:
Produces the output:
Red node represents the root, and the labels on each arrow signify the number of the operand.
I've also updated SymbolicRegression.jl that improves the
form_random_connection!
mutation but it relies on the newrandomised_topological_sort
function added to DynamicExpressions in this PR.Would be very happy to hear any and all feedback!