MilesCranmer commented 1 month ago

Expressions

This creates the Expression and AbstractExpression type for attaching operators and potentially other metadata to the expression itself.

This will likely be most useful as user-facing type to return self-contained expressions. I think it can also be useful for expression searches that attach other types of metadata.

This means you can do things like

expression(X)
# ^ Rather than `tree(X, operators)`

println(expression)
# ^ Rather than `print_tree(expression; operators, variable_names)`

without needing to provide the operators or variable names – they are stored inside the Expression struct.

Previously you needed to keep track of these yourself which is a bit of a pain if you aren't deeply knowledgeable about how everything works.

Parsing

In addition to this, I also add @parse_expression macro. This is a safer and easier alternative to operator overloading for creating Node types.

Basically it works as follows:

using DynamicExpressions

my_op(x) = cos(x^2)

operators = OperatorEnum(binary_operators=[+, -, *, /], unary_operators=[my_op])
variable_names = [:α, :β]  # Or strings

expr = @parse_expression(
    my_op(2.3 * α - β) / (0.5 * β * β),  # Pass an expression that will be parsed into an equivalent `Node`
    operators = operators,
    variable_names = variable_names,
)

X = randn(2, 100)

expr(X)
# ^ This is now valid, because `expr.operators` points to the operator enum.
# Also valid: `eval_tree_array(expr, X)`

println(expr)
# ^ Has the `α` and `β` despite everything being stored as the lightweight `Node` type!

expr'(X)
# ^ Forward-mode gradient with respect to inputs

You can also optionally provide a node_type argument to specify e.g., use of a GraphNode instead of a Node.

This is much nicer because it means you can have multiple sets of operator enums, and generate expressions without worrying about overwriting the global mapping from function to index.

TODO:

[x] Rewrite Expression to allow anything being set for the operators field (including nothing). And rewrite the field to be settings so its clear its not specifically operators.
[x] Allow variable_names to be nothing.
[x] Rewrite options function to allow passing both the expression (in case it contains its own settings), AND an options variable that by default is nothing. This would let you work with an expression that does not refer to its own options.
[x] Similar^ interface for variable_names.
[x] Create other field in Expression for referring to arbitrary metadata.
[x] Test a mechanism for MultiExpression and ParametricExpression

Tagging in case of interest @avik-pal @AlCap23 @rikhuijzer @Moelf

coveralls commented 1 month ago

Pull Request Test Coverage Report for Build 9182557303

Details

234 of 239 (97.91%) changed or added relevant lines in 6 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.6%) to 95.429%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
src/Expression.jl	88	89	98.88%
src/Parse.jl	107	111	96.4%
<!--	Total:	234	239	97.91%	-->

Totals
Change from base Build 8876266483:	0.6%
Covered Lines:	1858
Relevant Lines:	1947

💛 - Coveralls

daloic commented 1 month ago

With my very limited Julia knowledge, this looks like solving the issues I have and probably allowing the end user to later easily create a backbone expression from a parsed formula to feed the search with a seed. I like it.

github-actions[bot] commented 1 month ago

Benchmark Results

	master	703058e3396d7f...	master/703058e3396d7f...
eval/ComplexF32/evaluation	7.43 ± 0.45 ms	7.46 ± 0.47 ms	0.995
eval/ComplexF64/evaluation	9.64 ± 0.7 ms	9.74 ± 0.67 ms	0.99
eval/Float32/derivative	10.8 ± 1.6 ms	10.7 ± 1.6 ms	1.01
eval/Float32/derivative_turbo	10.7 ± 1.5 ms	10.7 ± 1.5 ms	1
eval/Float32/evaluation	2.74 ± 0.22 ms	2.69 ± 0.22 ms	1.02
eval/Float32/evaluation_bumper	0.569 ± 0.013 ms	0.528 ± 0.013 ms	1.08
eval/Float32/evaluation_turbo	0.682 ± 0.024 ms	0.699 ± 0.034 ms	0.975
eval/Float32/evaluation_turbo_bumper	0.569 ± 0.013 ms	0.534 ± 0.013 ms	1.07
eval/Float64/derivative	13.9 ± 0.49 ms	13.7 ± 0.53 ms	1.02
eval/Float64/derivative_turbo	13.9 ± 0.57 ms	13.7 ± 0.53 ms	1.02
eval/Float64/evaluation	2.88 ± 0.24 ms	2.88 ± 0.23 ms	1
eval/Float64/evaluation_bumper	1.29 ± 0.043 ms	1.19 ± 0.044 ms	1.08
eval/Float64/evaluation_turbo	1.16 ± 0.058 ms	1.17 ± 0.063 ms	0.99
eval/Float64/evaluation_turbo_bumper	1.3 ± 0.044 ms	1.19 ± 0.043 ms	1.09
utils/combine_operators/break_sharing	0.0407 ± 0.0013 ms	0.0414 ± 0.0013 ms	0.982
utils/convert/break_sharing	28 ± 0.72 μs	27.9 ± 0.82 μs	1
utils/convert/preserve_sharing	0.128 ± 0.0023 ms	0.127 ± 0.0024 ms	1.01
utils/copy/break_sharing	29.5 ± 0.77 μs	28.9 ± 0.9 μs	1.02
utils/copy/preserve_sharing	0.127 ± 0.0023 ms	0.127 ± 0.0028 ms	0.996
utils/count_constants/break_sharing	12.7 ± 0.19 μs	10.4 ± 0.14 μs	1.23
utils/count_constants/preserve_sharing	0.111 ± 0.002 ms	0.11 ± 0.0022 ms	1.01
utils/count_depth/break_sharing	14.1 ± 0.25 μs	12.5 ± 0.19 μs	1.13
utils/count_nodes/break_sharing	12.9 ± 0.2 μs	9.88 ± 0.16 μs	1.3
utils/count_nodes/preserve_sharing	0.112 ± 0.002 ms	0.115 ± 0.0057 ms	0.972
utils/get_set_constants!/break_sharing	0.0528 ± 0.00078 ms	0.0525 ± 0.00086 ms	1.01
utils/get_set_constants!/preserve_sharing	0.316 ± 0.005 ms	0.319 ± 0.0057 ms	0.99
utils/has_constants/break_sharing	5.14 ± 0.29 μs	4.75 ± 0.22 μs	1.08
utils/has_operators/break_sharing	1.78 ± 0.016 μs	1.65 ± 0.019 μs	1.08
utils/hash/break_sharing	30.1 ± 0.43 μs	0.033 ± 0.00046 ms	0.91
utils/hash/preserve_sharing	0.131 ± 0.002 ms	0.135 ± 0.003 ms	0.968
utils/index_constants/break_sharing	27.3 ± 0.55 μs	27.4 ± 0.59 μs	0.997
utils/index_constants/preserve_sharing	0.126 ± 0.0024 ms	0.128 ± 0.0033 ms	0.983
utils/is_constant/break_sharing	5.73 ± 0.31 μs	4.68 ± 0.22 μs	1.23
utils/simplify_tree/break_sharing	0.181 ± 0.016 ms	0.154 ± 0.015 ms	1.17
utils/simplify_tree/preserve_sharing	0.293 ± 0.017 ms	0.277 ± 0.016 ms	1.06
utils/string_tree/break_sharing	0.487 ± 0.0095 ms	0.502 ± 0.016 ms	0.97
utils/string_tree/preserve_sharing	0.63 ± 0.012 ms	0.644 ± 0.016 ms	0.978
time_to_load	0.203 ± 0.0017 s	0.213 ± 0.0025 s	0.953

rikhuijzer commented 1 month ago

Tagging in case of interest @avik-pal @AlCap23 @rikhuijzer @Moelf

Thanks indeed for the reminder that I should try out this package. I do have some use-cases where it would be useful for