Closed xinlnix closed 1 year ago
Hi @xinlnix, I'm sorry that you are disappointed with symbolic regression. Please keep in mind that SR with neural networks is still in its infancy !
However, even on very complicated and noisy data, physo usually manages to produce at least one symbolic model with R2 > 0.9. Did you set up your physical units properly ? It could be that physo is prevented from formulating a good model because of a bad units configuration.
I suggest disabling units constraints by leaving the units field empty. Can you share your code and/or your data ?
Cheers. Wassim
Thanks a lot for your reply. Here is the link to the csv file. The inputs are vg and vd. The output is ids in the csv file. The unit of vg and vd is V. The unit of ids is A.
The updated data link is: https://drive.google.com/file/d/1HbedKbnGL9C_lKaZBpN1ussg2W_xmMzC/view?usp=sharing
@WassimTenachi Thanks
Hi @xinlnix,
I noticed that your output "ids" contains extremely low values, close to the machine epsilon and at a very different scale than "vg" and "vd", which makes free constant optimisation very difficult. You should really scale these ids values such that $ids \in [0, 1]$.
I would suggest running something along the lines of:
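A minimal sketch of such a rescaling, assuming plain NumPy. The helper names `scale_to_unit_interval` and `unscale`, the optional `use_log` switch, and the sample currents are all hypothetical and not part of physo; the log option is included only because values spanning many decades stay crowded near 0 under a purely linear rescaling.

```python
import numpy as np

def scale_to_unit_interval(values, use_log=False):
    """Map values onto [0, 1]; return the scaled array and the parameters needed to undo it."""
    values = np.asarray(values, dtype=float)
    if use_log:
        if np.any(values <= 0):
            raise ValueError("log scaling needs strictly positive values; filter them first")
        values = np.log10(values)
    offset = values.min()
    span = values.max() - offset
    if span == 0:
        raise ValueError("cannot scale a constant array")
    return (values - offset) / span, (offset, span, use_log)

def unscale(scaled, params):
    """Invert scale_to_unit_interval, e.g. to express a fitted model in the original units."""
    offset, span, use_log = params
    values = np.asarray(scaled, dtype=float) * span + offset
    return 10.0 ** values if use_log else values

# Made-up currents spanning many decades (NOT the values from the csv file)
ids = np.array([1e-15, 3e-12, 5e-9, 2e-6, 1e-4])
ids_scaled, params = scale_to_unit_interval(ids, use_log=True)
print(ids_scaled)                   # all entries lie in [0, 1]
print(unscale(ids_scaled, params))  # recovers the original currents
```

Keeping the returned parameters around matters: any expression found on the scaled target has to be mapped back with `unscale` before it can be compared with the measured currents.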
expression, logs = physo.SR(X, y,
                            X_names = [ "vg" , "vd" ],
                            X_units = [ [1, 0] , [1, 0] ],
                            y_name  = "ids",
                            y_units = [0, 1],
                            fixed_consts       = [ 1. ],
                            fixed_consts_units = [ [0, 0] ],
                            free_consts_names  = [ "v0" , "v1" , "i0" ],
                            free_consts_units  = [ [1, 0] , [1, 0] , [0, 1] ],
                            op_names = ["mul", "add", "sub", "div", "inv", "n2", "sqrt", "neg", "exp", "log", "sin", "cos"]
                            )
With a unit system [voltage, ampere] and 3 free constants (having units of input variables and output).
This should give you the best chances of resolving your modelling problem.
Please keep us updated on the results !
Cheers. Wassim
I tried the config and applied log to ids, but the result is still not ideal. Here are the run results: https://drive.google.com/drive/folders/10_WPO3msfw3saEEGJXaDezJ0d3GVHr7c?usp=sharing
Hi @xinlnix ,
Thanks for sharing your results ! I may be missing something, but it looks like the results are very decent, no ?
From what I see, physo has converged to this expression $f(vg,vd) = \frac{i_{0} \left(v_{0} e^{\frac{v_{1}}{v_{0} + vd + 10 vg}} + vg\right)}{v_{0}}$ with a fit coefficient of $R^2 = 0.97$, which is quite good !
The learning curves look normal and I have inspected the fit in a 3D plot:
It seems to fit pretty well, except maybe at very low vd values (< 0.05), where your y points are lower than the values predicted by f. This is probably due to a lack of data, as only 8% of your data lies in this region where the target drops as vd goes down.
If you want to perfect your fit, I would suggest duplicating the low vd data points so they have more weight in the reward.
If you want to check the fit quality for yourself, here is the code I used:
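A minimal sketch of that duplication with pandas, assuming the column names vg, vd and ids from the csv file. The helper `upweight_low_vd`, the number of extra copies, and the tiny table are hypothetical; only the 0.05 threshold comes from the discussion above.

```python
import pandas as pd

def upweight_low_vd(df, vd_threshold=0.05, n_copies=3):
    """Return df with every row where vd < vd_threshold repeated n_copies extra times."""
    low = df[df["vd"] < vd_threshold]
    return pd.concat([df] + [low] * n_copies, ignore_index=True)

# Tiny made-up table standing in for the real csv (values are illustrative only)
df = pd.DataFrame({
    "vg":  [0.5, 0.5, 0.5, 1.0, 1.0],
    "vd":  [0.01, 0.10, 0.50, 0.10, 0.50],
    "ids": [1e-9, 1e-7, 2e-7, 1e-5, 3e-5],
})

low_fraction = (df["vd"] < 0.05).mean()   # share of rows in the low-vd region
df_weighted = upweight_low_vd(df, vd_threshold=0.05, n_copies=3)
weighted_fraction = (df_weighted["vd"] < 0.05).mean()

print(f"low-vd share before: {low_fraction:.2f}, after: {weighted_fraction:.2f}")
```

The weighted table would then be used to build X and y for the fit, while the R2 check is best done on the original, unweighted data so the duplicated rows do not inflate the score.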
import sympy
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import r2_score
##
res_equation = sympy.simplify("(-(((-(vg)/v0)+-(exp((v1/(((((((((((v0+vg)+vg)+vd)+vg)+vg)+vg)+vg)+vg)+vg)+vg)+vg))))))*i0)")
print(res_equation)
print(sympy.printing.latex(res_equation))
def res_func(vg, vd):
    # Free constants found by physo for this run
    i0 = -2.7882420371810275
    v0 = 3.9984434760528837
    v1 = 4.141095260217377
    y = i0*(v0*np.exp(v1/(v0 + vd + 10*vg)) + vg)/v0
    return y
##
# Calculate the R2 score
df = pd.read_csv("data.csv")
y_target = np.log10(df["ids"])
r2_score_result = r2_score(y_target[np.isfinite(y_target)], res_func(df["vg"], df["vd"])[np.isfinite(y_target)])
print("R2 score:", r2_score_result)
##
# Create a 3D plot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Plot the 3D scatter plot
ax.scatter(df["vg"], df["vd"], res_func(df["vg"], df["vd"]), c='r', marker='.', alpha=0.1, label = "physo expr.")
ax.scatter(df["vg"], df["vd"], np.log10(df["ids"]), c='k', marker='.', alpha=0.1, label = "target")
# Set labels for the axes
ax.set_xlabel('vg')
ax.set_ylabel('vd')
ax.set_zlabel('log10(ids)')
ax.legend()
plt.show()
Hi @xinlnix,
I'm closing this issue now but don't hesitate to re-open it if you need further help.
I appreciate your response and the effort you've put in. While PhySO has managed to achieve an R2 value of 0.97, unfortunately it falls short of my specific requirements. In contrast, using a neural network has enabled me to reach an R2 value of 0.9999. However, the drawback I've encountered is that the neural network is too slow.
Once again, I want to express my gratitude for your patient and considerate response.
Dear @xinlnix,
Yes, symbolic regression is not meant to outperform neural networks in fit quality. Neural networks will always achieve better fits and are much easier to train, since they are typically much more flexible.
The advantages of finding an equation are:
Unfortunately this does not include fit quality (at least not on the training range).
Take care. Wassim
I have tried this project on some data, but the accuracy is very low and it cannot be used in practice. Has anyone achieved good accuracy on their own data?