Closed dlubomski closed 2 years ago
While the genetic algorithm part should work fine, unfortunately the constant optimizer, which tries to approximate gradients, will not work with discrete spaces. It is best to simply convert your dataset and all the operators to real numbers. If it finds the correct equation with the real-valued extensions of everything, you are golden!
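To illustrate why approximated gradients break down on discrete values, here is a minimal Python sketch. It is not the optimizer SymbolicRegression.jl actually uses; `finite_difference`, `discrete_loss`, and `smooth_loss` are hypothetical names made up for this example. It only shows that a loss which rounds its constant to an integer looks flat to a finite-difference estimate, while the real-valued version gives a usable slope.

```python
def finite_difference(f, c, h=1e-6):
    """Central finite-difference estimate of df/dc (illustrative only)."""
    return (f(c + h) - f(c - h)) / (2 * h)

def discrete_loss(c):
    # The constant is forced onto the integers, so the loss is piecewise flat.
    return (round(c) - 3) ** 2

def smooth_loss(c):
    # Real-valued version of the same loss.
    return (c - 3) ** 2

grad_discrete = finite_difference(discrete_loss, 1.2)
grad_smooth = finite_difference(smooth_loss, 1.2)

print(grad_discrete)  # 0.0: no information about which way to move c
print(grad_smooth)    # close to -3.6: points towards c = 3
```

With a zero gradient almost everywhere, a gradient-based step has nothing to follow, which is the reason for moving everything to real numbers.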
This looks like a nice way to convert operators: https://stackoverflow.com/a/46674398/2689923.
e.g., (in Julia)
NOT(a) = (1-a)
AND(a, b) = a * b
OR(a, b) = a + b - AND(a, b)
XOR(a, b) = AND(OR(a, b), NOT(AND(a, b)))
Then you pass NOT as a unary operator, and AND, OR, XOR as binary operators to the SymbolicRegression Options. I have no idea if this will work, but it sounds fun to try!
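As a quick sanity check, the real-valued definitions can be compared against the ordinary Boolean truth tables. This is a minimal sketch in Python rather than Julia; the helper `agrees_on_binary_inputs` is a hypothetical name introduced only for this check.

```python
from itertools import product

# Real-valued extensions of the logical operators (same formulas as the Julia ones).
def NOT(a):
    return 1 - a

def AND(a, b):
    return a * b

def OR(a, b):
    return a + b - AND(a, b)

def XOR(a, b):
    return AND(OR(a, b), NOT(AND(a, b)))

def agrees_on_binary_inputs():
    """Return True if every operator matches Boolean logic on inputs in {0, 1}."""
    for a, b in product([0.0, 1.0], repeat=2):
        p, q = bool(a), bool(b)
        if NOT(a) != float(not p):
            return False
        if AND(a, b) != float(p and q):
            return False
        if OR(a, b) != float(p or q):
            return False
        if XOR(a, b) != float(p != q):
            return False
    return True

print(agrees_on_binary_inputs())  # True
print(OR(0.5, 0.5))               # 0.75: defined between 0 and 1 as well
```

On 0 and 1 the operators reproduce the logic tables exactly, and in between they are smooth polynomials, which is what the constant optimizer needs.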
But it could be interesting to implement native support for discrete relations in the future.
Cheers, Miles
I got curious and I'm very happy to report that this actually works!
Here's some code for the Python frontend PySR (you mentioned you were new to Julia) to run this:
import numpy as np
from pysr import pysr
# True equation:
truth = lambda x: (x[0] or x[1]) and (x[2] or x[3])
# Generate random binary numbers:
X = 1.0 * (np.random.randn(100, 5) > 0.5)
# Use the builtin bool: the np.bool alias was removed in NumPy 1.24.
y = np.array([truth(x.astype(bool)) for x in X]).astype(np.float32)
binary_operators = [
    "AND(x, y) = x * y",
    "OR(x, y) = x + y - x * y",
    "XOR(x, y) = AND(OR(x, y), NOT(AND(x, y)))",
]
unary_operators = [
    "NOT(x) = 1 - x",
]
equations = pysr(
    X,
    y,
    niterations=5,
    binary_operators=binary_operators,
    unary_operators=unary_operators,
)
print(equations)
The output is:
Complexity | MSE | Equation |
---|---|---|
1 | 0.17 | x2 |
3 | 0.13 | AND(x3, x1) |
5 | 0.08 | AND(OR(x0, x1), x3) |
7 | 0.0 | AND(OR(x0, x1), OR(x3, x2)) |
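The complexity-7 row can be checked exhaustively against the true equation, since there are only 32 binary inputs for five features. This is a small sketch that redefines AND and OR in plain Python; `recovered` and `mismatches` are names introduced here for the check, not part of PySR.

```python
from itertools import product

def AND(x, y):
    return x * y

def OR(x, y):
    return x + y - x * y

def truth(x):
    # True equation from the example above.
    return float((x[0] or x[1]) and (x[2] or x[3]))

def recovered(x):
    # Complexity-7 equation from the table: AND(OR(x0, x1), OR(x3, x2)).
    return AND(OR(x[0], x[1]), OR(x[3], x[2]))

# Enumerate every binary input (x4 is unused by both expressions).
mismatches = [x for x in product([0.0, 1.0], repeat=5) if recovered(x) != truth(x)]
print(len(mismatches))  # 0
```

No mismatches means the recovered expression is exactly the original Boolean formula on binary inputs, not just a good fit on the 100 sampled rows.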
Closing this for now. Let me know if you have other questions.
Is it possible to switch from Float to UInt8 numbers? I'm new to Julia.
I would like to have SymbolicRegression.jl work with discrete numbers and then be able to use binary operators.