About training data - Githubissues

swx-10 commented 1 year ago

Hello, what is the form of this training data? Can you give an example?

xinlnix commented 1 year ago

I have the same question. It would be great to show an example for a general dataset, and it would also help to explain what the various config options mean. Thanks a lot.

WassimTenachi commented 1 year ago

Hi @swx-10 and @xinlnix,

There is now an example in the getting started section! You can use the convenient physo.SR function by passing it a general dataset in the form of X and y arrays.

Here is the documentation for X and y:

X : numpy.array of shape (n_dim, ?,) of float
        Values of the input variables of the problem with n_dim = nb of input variables.
y : numpy.array of shape (?,) of float
        Values of the target symbolic function to recover when applied on input variables contained in X.
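As a minimal sketch of the shape convention above (my own illustration, not taken from the PhySO docs): tabular data usually comes with one row per sample, so it has to be transposed before being passed as X. The array `table` and its column layout are hypothetical.

```python
import numpy as np

# Hypothetical table: one row per sample, columns = (input 0, input 1, target)
table = np.array([
    [1.0, 2.0, 5.0],
    [2.0, 0.5, 4.5],
    [3.0, 1.0, 7.0],
    [4.0, 3.0, 12.0],
])

X = table[:, :-1].T   # inputs, shape (n_dim, n_samples) = (2, 4)
y = table[:, -1]      # target, shape (n_samples,) = (4,)

# X and y must hold the same number of samples
assert X.shape[1] == y.shape[0]
print(X.shape, y.shape)
```

The variables run along the first axis of X and the samples along the second, which is the transpose of the usual (n_samples, n_features) layout.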

I'm working on a tutorial for specifying custom hyperparameters and for documenting them in one convenient place.

swx-10 commented 1 year ago

Hi @WassimTenachi, sorry, I still don't understand the details of the data input and output. Can you give me a more specific example, something like "hello --> 你好"? Thanks a lot.

WassimTenachi commented 1 year ago

(X,y) -> analytical function

Symbolic regression (SR) consists in inferring a free-form symbolic analytical function $f: \mathbb{R}^n \longrightarrow \mathbb{R}$ that fits $y = f(x_0, ..., x_{n-1})$ given $(x_0, ..., x_{n-1}, y)$ data.

The input data is a dataset (X, y) with associated units (length, time, mass, etc.), e.g.:

import numpy as np
import physo

z = np.random.uniform(-10, 10, 50)
v = np.random.uniform(-10, 10, 50)
X = np.stack((z, v), axis=0)
y = 1.234*9.807*z + 1.234*v**2

Here z is a length of dimension $L^{1}, T^{0}, M^{0}$, v is a velocity of dimension $L^{1}, T^{-1}, M^{0}$, and y is an energy of dimension $L^{2}, T^{-2}, M^{1}$. The search is allowed to use a fixed constant $1$ of dimension $L^{0}, T^{0}, M^{0}$ (i.e. dimensionless) and free constants $m$ of dimension $L^{0}, T^{0}, M^{1}$ and $g$ of dimension $L^{1}, T^{-2}, M^{0}$.
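To make the unit vectors concrete, here is a small sketch that checks the target expression $m(gz + v^2)$ is dimensionally consistent with an energy. It uses plain numpy and the (L, T, M) ordering from this thread; the helpers `mul` and `power` are hypothetical names of mine and are not part of PhySO.

```python
import numpy as np

# Units as exponent vectors in (L, T, M) order
units = {
    "z": np.array([1, 0, 0]),    # length
    "v": np.array([1, -1, 0]),   # velocity
    "m": np.array([0, 0, 1]),    # mass
    "g": np.array([1, -2, 0]),   # acceleration
}
y_units = np.array([2, -2, 1])   # energy

def mul(u1, u2):
    # Multiplying two quantities adds their exponent vectors
    return u1 + u2

def power(u, p):
    # Raising a quantity to a power scales its exponent vector
    return p * u

gz = mul(units["g"], units["z"])   # units of g*z
v2 = power(units["v"], 2)          # units of v**2

# Addition is only meaningful between terms with identical units
assert np.array_equal(gz, v2)

result = mul(units["m"], gz)       # units of m*(g*z + v**2)
print(np.array_equal(result, y_units))
```

Both g*z and v**2 come out as $L^{2}, T^{-2}, M^{0}$, so they can be added, and multiplying by m gives the energy units declared for y.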

The output is a mathematical expression. You can obtain it using PhySO by running:

expression, logs = physo.SR(X, y,
                            X_names = [ "z"       , "v"        ],
                            X_units = [ [1, 0, 0] , [1, -1, 0] ],
                            y_name  = "E",
                            y_units = [2, -2, 1],
                            fixed_consts       = [ 1.      ],
                            fixed_consts_units = [ [0,0,0] ],
                            free_consts_names = [ "m"       , "g"        ],
                            free_consts_units = [ [0, 0, 1] , [1, -2, 0] ],
                            op_names = ["mul", "add", "sub", "div", "inv", "n2", "sqrt", "neg", "exp", "log", "sin", "cos"]
)

You can inspect the recovered mathematical expression via:

>>> print(expression.get_infix_pretty(do_simplify=True))
  ⎛       2⎞
m⋅⎝g⋅z + v ⎠
>>> print(expression.get_infix_latex(do_simplify=True))
'm \\left(g z + v^{2}\\right)'

Free constants can be inspected via:

>>> print(expression.free_const_values.cpu().detach().numpy())
array([9.80699996, 1.234     ])
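As a quick sanity check, the recovered constants can be plugged back into the recovered expression and compared with the data. This sketch uses numpy only and does not call PhySO; the seeded generator is my own choice for reproducibility, so the samples differ from those drawn in the example above.

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded generator (assumption, for reproducibility)
z = rng.uniform(-10, 10, 50)
v = rng.uniform(-10, 10, 50)
y = 1.234*9.807*z + 1.234*v**2      # same target function as in the dataset example

g, m = 9.807, 1.234                 # free constant values reported in this thread
y_pred = m*(g*z + v**2)             # recovered expression m*(g*z + v^2)

# Root mean squared error between the recovered expression and the data
rmse = np.sqrt(np.mean((y_pred - y)**2))
print("RMSE = {:e}".format(rmse))
```

With the exact constants the two forms differ only by floating-point rounding, so the RMSE should be negligible compared with the scale of y.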

physo.SR also returns the log of the run from which one can inspect Pareto front expressions:


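# Retrieving the Pareto front from the run log. This call follows the PhySO README as I recall it; the thread itself omits it.
pareto_front_complexities, pareto_front_expressions, pareto_front_r, pareto_front_rmse = logs.get_pareto_front()
for i, prog in enumerate(pareto_front_expressions):
    # Showing expression
    print(prog.get_infix_pretty(do_simplify=True))
    # Showing free constants
    free_consts = prog.free_const_values.detach().cpu().numpy()
    for j in range(len(free_consts)):
        print("%s = %f"%(prog.library.free_const_names[j], free_consts[j]))
    # Showing RMSE
    print("RMSE = {:e}".format(pareto_front_rmse[i]))
    print("-------------")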

Returning:

   2
m⋅v 
g = 1.000000
m = 1.486251
RMSE = 6.510109e+01
-------------
g⋅m⋅z
g = 3.741130
m = 3.741130
RMSE = 5.696636e+01
-------------
  ⎛       2⎞
m⋅⎝g⋅z + v ⎠
g = 9.807000
m = 1.234000
RMSE = 1.675142e-07
-------------
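For readers unfamiliar with the term, a Pareto front here keeps, for each level of complexity, only expressions that are more accurate than every simpler one. The sketch below illustrates the idea and is not PhySO's implementation. The RMSE values of the first three candidates are the ones printed above; the complexities are my own prefix-token counts (an assumption about how complexity is measured), and the fourth candidate is made up to show a dominated expression being dropped.

```python
# (expression, complexity, rmse)
candidates = [
    ("m*v**2",         4, 6.510109e+01),
    ("g*m*z",          5, 5.696636e+01),
    ("m*(g*z + v**2)", 8, 1.675142e-07),
    ("m*(z + v**2)",   6, 7.0e+01),   # hypothetical, for illustration only
]

def pareto_front(cands):
    # Walk candidates from simplest to most complex and keep one only if
    # it improves on the best RMSE seen among simpler candidates
    front = []
    best_rmse = float("inf")
    for expr, complexity, rmse in sorted(cands, key=lambda c: (c[1], c[2])):
        if rmse < best_rmse:
            front.append((expr, complexity, rmse))
            best_rmse = rmse
    return front

front = pareto_front(candidates)
for expr, complexity, rmse in front:
    print(expr, complexity, "{:e}".format(rmse))
```

The hypothetical candidate is more complex than g*m*z yet less accurate, so it is dominated and left out of the front.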

WassimTenachi commented 1 year ago

Hi @swx-10, can you tell me if this resolved your documentation problem so I can close this thread?

swx-10 commented 1 year ago

Yes, I'm sorry for the late reply. Thanks.

WassimTenachi / PhySO

About training data #15