ehsanhaghighat / sciann

Deep learning for Engineers - Physics Informed Deep Learning
http://sciann.com

Error - 'sciann.functionals' has no attribute 'mlp_functional' #79

Open ssingh-ipa opened 1 year ago

ssingh-ipa commented 1 year ago

I am getting the following error when I execute a PDE using sciann.

x = sn.Variable('x', dtype=dtype)
tao = sn.Variable('tao', dtype=dtype)
C_n = sn.Functional('C_n', [x,tao], [10,20,40,20,10], act)
PDE = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))

Input In [63], in <cell line: 9> PDE = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))

File ~\Anaconda3\lib\site-packages\sciann\utils\math.py:1358 in grad return _gdiff("Grad", f, *args, **kwargs)

File ~\Anaconda3\lib\site-packages\sciann\utils\math.py:1277 in _gdiff assert is_functional(f), \

File ~\Anaconda3\lib\site-packages\sciann\utils\validations.py:25 in is_functional if isinstance(f, (sciann.functionals.mlp_functional.MLPFunctional,

AttributeError: module 'sciann.functionals' has no attribute 'mlp_functional'

Please suggest how to solve this.

linuswalter commented 1 year ago

Hi @ssingh-ipa, I reproduced your case with the following code but did not receive any error message :thinking:. Maybe you can give a bit more context about your problem.

import sciann as sn
act = "tanh"
x = sn.Variable('x')
tao = sn.Variable('tao')
C_n = sn.Functional('C_n', [x,tao], [10,20,40,20,10], act)
PDE = (sn.diff(C_n, tao)) - ((2*sn.diff(C_n,x) + x*sn.diff(C_n, x, order=2)))
ssingh-ipa commented 1 year ago

Hi @linuswalter,

I ran this code snippet in a Jupyter Notebook and it works without errors. However, I switched to the new version of SciANN and ran the same code snippet in Spyder; that's when I started getting this mlp_functional error. I have now switched back to the old version, and the code works again.
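(In case it helps reproduce: a quick way to confirm which SciANN version each environment actually picks up, assuming the package exposes the usual __version__ attribute:)

import sciann as sn
print(sn.__version__)  # compare the version loaded in Jupyter vs. the one loaded in Spyder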

I am now running into another issue related to the size of my dataset. Please help me with this one. Here is my code snippet:

import numpy as np
import sciann as sn
from sciann import diff   # bare diff(...) is used below

act = "tanh"
x = sn.Variable('x', dtype=dtype)
tao = sn.Variable('tao', dtype=dtype)
C_n = sn.Functional('C_n', [x,tao], [10,20,40,20,10], act)
PDE_neg  = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))
IC = (tao==0)*(0.<=x<=1.)*(C_n-1)
BC1 = (x==0)*(tao>0)*diff(C_n,x)
BC2 = (x==1)*(tao>0)*(diff(C_n,x)-norm_Jn) #norm_Jn is an array of size (26362,)

model_n = sn.SciModel([x, tao], [PDE_neg,IC,BC1,BC2], loss_func="mse", optimizer="adam")

tao_value_neg = Dsneg*Time/(Rneg**2) #tao_value_neg is also an array of size (26362,)
normalized_tao_neg = NormalizeData(tao_value_neg).to_numpy()
tao_input_neg = normalized_tao_neg.flatten()

x_input_n = np.linspace(0, 1, 200)
x_input_n_mesh, tao_input_n_mesh = np.meshgrid(
    x_input_n, 
    tao_input_neg
)
x_in_n = np.reshape(x_input_n_mesh, (-1))
tao_in_n = np.reshape(tao_input_n_mesh, (-1))

h_n = model_n.train([x_in_n,tao_in_n],
                4*['zero'],
                learning_rate=0.001, 
                epochs=200, 
                stop_loss_value=1e-10,
                reduce_lr_after=15,
                stop_lr_value=1e-8,
                verbose=1, 
                batch_size=256,
                shuffle=True,
                validation_data=None)

I get the following memory error: MemoryError: Unable to allocate 1.01 TiB for an array with shape (5272400, 26362) and data type float64.
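(For what it's worth, the numbers appear consistent with the 26362-long norm_Jn target array being paired against every one of the 200 × 26362 = 5,272,400 meshgrid collocation points: 5,272,400 × 26,362 × 8 bytes ≈ 1.01 TiB.)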

It is not clear in the documentation how to assign the inputs for training the model. Also, my input files are large files of experimental data. Please suggest how to solve this issue.

linuswalter commented 1 year ago

@ssingh-ipa, I think you should prepare your target data differently. In BC2 = (x==1)*(tao>0)*(diff(C_n,x)-norm_Jn) you could pass norm_Jn if it were a scalar value, but if you pass an array, it is not clear at which exact point of tao each target value is assigned for your Functional C_n. Also, your input data x_input_n needs to be randomly sampled in your domain; don't pass it to the NN in sorted order. It's best to check the SciANN DataGenerator to understand the exact data structures that you need. You can follow the application example on Mandel's problem, in which the DataGenerator is applied.
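A minimal sketch of what random sampling could look like (the size N is hypothetical, just to illustrate the idea; the DataGenerator classes do this properly for each target group):

import numpy as np

N = 20000                                                 # number of collocation points (illustrative)
x_in_n   = np.random.uniform(0., 1., N).reshape(-1, 1)    # random x in [0, 1]
tao_in_n = np.random.uniform(0., 1., N).reshape(-1, 1)    # random normalized tao
# pass [x_in_n, tao_in_n] to model_n.train(...) instead of the sorted meshgrid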

ssingh-ipa commented 1 year ago

@linuswalter Thank you. x_input_n is a normalized value, so yes, I can try to sample it randomly as done in Mandel's problem.

However, norm_Jn is a function of a parameter (Current), which is a time series input. Hence the physical meaning of BC2 is that at x=1 and tao>0, dC/dx = I(t)*constants
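Written out, that is $$\left.\frac{\partial C_\text{n}}{\partial x}\right|_{x=1,\ \tau>0} = I(t)\cdot\text{const} = \mathtt{norm\_Jn}(\tau).$$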

This is a Neumann boundary condition, and I couldn't find similar examples for reference. What would you suggest for this?

linuswalter commented 1 year ago

@ssingh-ipa ah, I think I understand your problem better now. Well, if norm_Jn is a function of t, then you can assign a sn.Functional to it and train it on your dataset (the one of size 26362). After this first training step, you call norm_Jn.set_trainable(False) and train your main model model_n. Something like this:

import numpy as np
import sciann as sn
from sciann import diff

act = "tanh"
x = sn.Variable('x', dtype=dtype)
tao = sn.Variable('tao', dtype=dtype)

norm_Jn = sn.Functional('norm_Jn', [tao], 4*[10], "tanh")
C_n = sn.Functional('C_n', [x,tao], [10,20,40,20,10], act)

PDE_neg  = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))
IC = (tao==0)*(0.<=x<=1.)*(C_n-1)
BC1 = (x==0)*(tao>0)*diff(C_n,x)
BC2 = (x==1)*(tao>0)*(diff(C_n,x)-norm_Jn) # norm_Jn is now a Functional of tao, trained on the 26362-long dataset
targets_norm_Jn = norm_Jn

mod_norm_Jn = sn.SciModel([tao],[targets_norm_Jn], optimizer="adam",loss_func="mse")
model_n = sn.SciModel([x, tao], [PDE_neg,IC,BC1,BC2], loss_func="mse", optimizer="adam")

tao_value_neg = Dsneg*Time/(Rneg**2) #tao_value_neg is also an array of size (26362,)
normalized_tao_neg = NormalizeData(tao_value_neg).to_numpy()
tao_input_neg = normalized_tao_neg.flatten()

x_input_n = np.linspace(0, 1, 200)
x_input_n_mesh, tao_input_n_mesh = np.meshgrid(
    x_input_n, 
    tao_input_neg
)
x_in_n = np.reshape(x_input_n_mesh, (-1))
tao_in_n = np.reshape(tao_input_n_mesh, (-1))

norm_Jn.set_trainable(True)
mod_norm_Jn.compile()

H_norm_Jn = mod_norm_Jn.train([tao_in_n], target_values_norm_Jn)

norm_Jn.set_trainable(False)
mod_norm_Jn.compile()

H_C_n = model_n.train(...)

Well, the values for target_values_norm_Jn are probably non-zero, right? So you have to pass them in a certain data structure. Example:

target_values_norm_Jn = [(Array_of_indices,Array_of_target_values)]

The shapes of Array_of_indices and Array_of_target_values are identical.
Array_of_indices has dtype=int64 and contains the indices of the input dataset tao_in_n at which each corresponding target value in Array_of_target_values is assigned.
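A minimal sketch of how such a pair could be built (norm_Jn_measured is a hypothetical placeholder for your known flux values):

import numpy as np

norm_Jn_measured = np.random.rand(26362)   # placeholder for your measured flux values, shape (n,)

# assumes the first len(norm_Jn_measured) entries of tao_in_n are the measurement times
Array_of_indices       = np.arange(len(norm_Jn_measured), dtype=np.int64)   # indices into tao_in_n
Array_of_target_values = norm_Jn_measured.reshape(-1, 1)                    # value assigned at each index

target_values_norm_Jn = [(Array_of_indices, Array_of_target_values)]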

The following example is from the data generator in 1D with 4 different targets. Please note that each column of the input dataset is in reality an extra array in the input_list and needs to have the shape (-1,1). I have just put it into a Pandas DataFrame for visualization purposes.

[Image: excalidraw sketch of the SciANN DataGenerator data structure (1D, 4 targets)]

In your case, you have to assign values different from zero in Array_of_target_values. I hope my description helps you a little bit. I also really recommend understanding the DataGenerator and adapting the code to your needs.

Best, Linus

ssingh-ipa commented 1 year ago

Hi Linus @linuswalter,

This definitely helps. Thanks a lot.

I have 2 open points:

  1. You trained mod_norm_Jn with the input data tao_in_n in a separate network called H_norm_Jn. But then how is H_norm_Jn integrated with the final network, H_C_n? Does that happen automatically with respect to tao_in_n?

  2. Secondly, as you know, tao_in_n is reshaped from a meshgrid. This creates a length discrepancy, i.e., len(Array_of_target_values) is 86596 while len(tao_in_n) = 17319200 (the arithmetic is spelled out below). Should I use (x_input_n_mesh, tao_input_n_mesh) as the inputs for training the model? To phrase it otherwise, is the 1D reshape x_in_n = np.reshape(x_input_n_mesh, (-1)); tao_in_n = np.reshape(tao_input_n_mesh, (-1)) no longer useful?
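(Arithmetic note: the factor between the two lengths is exactly the 200 x-points, since np.meshgrid repeats every tao value once per x value: 86,596 × 200 = 17,319,200.)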

The meshgrid issue will be solved once I integrate the DataGenerator in my code. For reference, are there any examples where the XT data generator is implemented?

Best, Soumya

linuswalter commented 1 year ago

@ssingh-ipa

Hi Soumya, sorry for the late reply.

  1. Actually, norm_Jn is the neural network. We train it in the model setup mod_norm_Jn. Then, we can use the trained NN norm_Jn in the next model setup, model_n, in which we want to train your main NN C_n.
  2. I am not sure if I understand this correctly. I think the general data structure that you create for (x_input_n_mesh, tao_input_n_mesh) via 1D reshape is correct. But for the array with len(Array_of_target_values) = 86596 you need to create an extra input array of collocation points. Please let me know if you were able to solve your problem.

Actually, an alternative to representing norm_Jn by an extra NN is to change the definition of your loss term BC2. Your current definition is $$\mathtt{BC2}(x=1,\ t>0): \qquad \frac{\partial C_\text{n}}{\partial x} - \mathtt{norm\_Jn}=0$$ which is written in the code via

BC2 = (x==1)*(tao>0)*(diff(C_n,x)-norm_Jn)

Instead, you could define $$\mathtt{BC2}(x=1,\ t>0): \qquad \frac{\partial C_\text{n}}{\partial x} = \mathtt{norm\_Jn}$$ which would be implemented via

BC2 = (x==1)*(tao>0)*(diff(C_n,x)) 

and assign the array norm_Jn as targets. That means you need to create input arrays/collocation points that respect x=1 and t>0. It is important that you assign an input value in the arrays x and tao for each target value in norm_Jn. So instead of assigning zeros, you assign the respective values of norm_Jn.
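A rough sketch of that idea (all array names and sizes here are hypothetical stand-ins, just to show the shape of the data; the DataGeneratorXT builds the 'domain', 'ic' and 'bc-left' groups for you):

import numpy as np

# hypothetical stand-ins for the real data
x_domain   = np.random.uniform(0., 1., (10000, 1))   # interior collocation points
tao_domain = np.random.uniform(0., 1., (10000, 1))
tao_known     = np.linspace(1e-3, 1., 26362)          # tao values where the flux is known (tao > 0)
norm_Jn_known = np.random.rand(26362)                  # placeholder for the known flux values

# extra collocation points on the boundary x = 1, one per known flux value
x_bc2, tao_bc2 = np.ones((26362, 1)), tao_known.reshape(-1, 1)

# stack the boundary points after the interior points
x_all   = np.vstack([x_domain,   x_bc2])
tao_all = np.vstack([tao_domain, tao_bc2])

# BC2 target: (indices of the boundary points in the stacked input, the flux values there)
bc2_ids    = np.arange(len(x_domain), len(x_all), dtype=np.int64)
bc2_target = (bc2_ids, norm_Jn_known.reshape(-1, 1))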

Actually, the application example on Terzaghi's problem uses the DataGeneratorXT: https://github.com/sciann/sciann-applications/tree/master/SciANN-PoroElasticity

I hope this helps you a bit. Please let me know if something wasn't clear.

Best, Linus

ssingh-ipa commented 1 year ago

@linuswalter

Hi Linus,

The first approach (a separate network for norm_Jn) has definitely improved my model prediction. However, it is still not accurate enough for predicting C_n. Some context here: I am training the PINN to predict the change in concentration across the electrodes of a Li-ion battery. This is modeled with Fick's second law of diffusion and its respective Neumann boundary conditions; norm_Jn is the flux density.
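For reference, the nondimensional PDE implemented as PDE_neg below reads $$x\,\frac{\partial C_\text{n}}{\partial \tau} = 2\,\frac{\partial C_\text{n}}{\partial x} + x\,\frac{\partial^2 C_\text{n}}{\partial x^2},$$ with Neumann conditions at x=0 (BC1) and x=1 (BC2).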

Please find the updated code below:

# ----------------------- Neural Network Setup -----------------------

import numpy as np
import tensorflow as tf
import sciann as sn
from sciann import diff
# DataGeneratorXT below comes from the sciann-applications repository

sn.reset_session()
sn.set_random_seed(1234)
x = sn.Variable('x', dtype=dtype)
tao = sn.Variable('tao', dtype=dtype)

norm_Jn = sn.Functional('norm_Jn', [tao], 4*[10], 'sigmoid')
C_n = sn.Functional('C_n', [x,tao], 8*[20], act)

PDE_neg  = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))
IC = (tao==0)*(0.<=x<=1.)*(C_n-1)
BC1 = (x==0)*(tao>0)*diff(C_n,x)
BC2 = (x==1)*(tao>0)*(diff(C_n,x)-norm_Jn)
targets_norm_Jn = norm_Jn

targets_PDE = [sn.PDE(PDE_neg), IC, BC1, BC2]

model_pred = sn.SciModel([x, tao], [norm_Jn, C_n])  

mod_norm_Jn = sn.SciModel([tao],[targets_norm_Jn], optimizer="adam",loss_func="mse")
model_n = sn.SciModel([x, tao], targets_PDE, loss_func="mse", optimizer="adam")

tao_value_neg = Dsneg*Time/(Rneg**2)
normalized_tao_neg = NormalizeData(tao_value_neg).to_numpy()
tao_input_neg = normalized_tao_neg

Array_of_indices= np.arange(len(tao_input_neg)).astype(np.int64)
Array_of_target_values=normalized_J_n

target_values_norm_Jn = [(Array_of_indices,Array_of_target_values)]

norm_Jn.set_trainable(True)
mod_norm_Jn.compile()

#----training parameters-------
NUM_SAMPLES = 100000
BATCH_SIZE = 1000  # higher batch size results in more accuracy
BATCH_SIZE_Jn = 300
EPOCHS_PDE = 500   # make sure (NUM_SAMPLES/BATCH_SIZE)*EPOCHS > 50K (total gradient updates)
EPOCHS_Jn = 500
STOP_AFTER = None

ADAPTIVE_WEIGHTS = {'method': 'GN', 'freq':300, 'use_score':True, 'alpha':1.0} 
ADAPTIVE_WEIGHTS_Jn = {'method': 'NTK', 'freq':200}

initial_lr = 1e-3
final_lr = initial_lr/100

learning_rate_PDE = {
    "scheduler": "ExponentialDecay", 
    "initial_learning_rate": initial_lr,
    "final_learning_rate": final_lr, 
    "decay_epochs": EPOCHS_PDE
}

learning_rate_Jn = {
    "scheduler": "ExponentialDecay", 
    "initial_learning_rate": initial_lr,
    "final_learning_rate": final_lr, 
    "decay_epochs": EPOCHS_Jn
}

my_callback_J = tf.keras.callbacks.EarlyStopping(monitor='loss',patience=15,verbose=2)
my_callback = tf.keras.callbacks.EarlyStopping(monitor='loss',patience=30,verbose=2)

H_norm_Jn = mod_norm_Jn.train([tao_input_neg], 
                              target_values_norm_Jn,
                              learning_rate=learning_rate_Jn,
                              epochs=EPOCHS_Jn, 
                              callbacks=[my_callback_J],
                              batch_size=BATCH_SIZE_Jn, 
                              stop_loss_value=1e-10, 
                              #reduce_lr_after=15, 
                              stop_lr_value=1e-8, 
                              adaptive_weights=ADAPTIVE_WEIGHTS_Jn,
                              verbose=1
                             ) 
norm_Jn.set_trainable(False)
mod_norm_Jn.compile()   

td_0 = 0.0 
td_f = len(tao_input_neg)   
xd_min = 0.0 
xd_max = 1.0 

dg_target = DataGeneratorXT(
    X=[xd_min,xd_max], T=[td_0,td_f], 
    num_sample=NUM_SAMPLES,
    targets=['domain', 'ic', 'bc-left', 'bc-right']
)
input_data_target, target_data_PDE = dg_target.get_data()

dg_target.plot_data()

H_C_n = model_n.train(input_data_target, 
                      target_data_PDE,
                      learning_rate=learning_rate_PDE, 
                      epochs=EPOCHS_PDE, 
                      callbacks=[my_callback],
                      stop_loss_value=1e-9,
                      stop_lr_value=1e-8,
                      verbose=2,
                      batch_size=BATCH_SIZE,
                      adaptive_weights=ADAPTIVE_WEIGHTS
                     )

Now, when I run the NN norm_Jn separately, i.e., without C_n, the prediction is quite good, as shown below:

[Image 1: norm_Jn prediction when trained on its own]

However, when norm_Jn is evaluated within the full setup shown above, the prediction degrades, adversely affecting my PINN's accuracy. The new prediction is shown below:

[Image 2: norm_Jn prediction within the full PINN setup]

What could be the reason for this?

Alternative:

I also tried the alternative you mentioned, i.e., assigning the respective values of norm_Jn as targets. However, my data generator is 2D with respect to x and tao, and when BC2 is trained on target values outside of these collocation points, my code no longer executes.

sn.reset_session()
sn.set_random_seed(1234)
x = sn.Variable('x', dtype=dtype)
tao = sn.Variable('tao', dtype=dtype)

C_n = sn.Functional('C_n', [x,tao], 8*[20], act)
PDE_neg  = (x*diff(C_n, tao)) - ((2*diff(C_n,x) + x*diff(C_n, x, order=2)))
IC = (tao==0)*(0.<=x<=1.)*(C_n-1)
BC1 = (x==0)*(tao>0)*diff(C_n,x)
BC2 = (x==1)*(tao>0)*diff(C_n,x)
targets_PDE = [sn.PDE(PDE_neg), IC, BC1] 
model_n = sn.SciModel([x, tao], [sn.PDE(PDE_neg),IC,BC1,BC2], loss_func="mse", optimizer="adam")
#model_n = sn.SciModel([x, tao], [targets_PDE,BC2], loss_func="mse", optimizer="adam")

td_0 = 0.0 
td_f = len(tao_input_neg)   
xd_min = 0.0 
xd_max = 1.0 

dg_target = DataGeneratorXT(
    X=[xd_min,xd_max], T=[td_0,td_f], 
    num_sample=NUM_SAMPLES,
    targets=['domain', 'ic', 'bc-left', 'bc-right']
)
input_data_target, target_data_PDE = dg_target.get_data()
dg_target.plot_data()

H_C_n = model_n.train(input_data_target, 
                      [target_data_PDE,Array_of_target_values],
                      learning_rate=learning_rate_PDE, 
                      epochs=EPOCHS_PDE, 
                      callbacks=[my_callback],
                      stop_loss_value=1e-9,
                      stop_lr_value=1e-8,
                      verbose=2,
                      batch_size=BATCH_SIZE,
                      adaptive_weights=ADAPTIVE_WEIGHTS
                     )

I get the following error:

--> 351 assert len(y_true)==len(self._constraints), \
    352     'Miss-match between expected targets (constraints) defined in SciModel and ' \
    353     'the provided y_trues - expecting the same number of data points. '
    355 num_sample = x_true[0].shape[0]
    356 assert all([x.shape[0]==num_sample for x in x_true[1:]]), \
    357     'Inconsistent sample size among Xs. '

AssertionError: Miss-match between expected targets (constraints) defined in SciModel and the provided y_trues - expecting the same number of data points.

The alternative seems simpler but I am not able to get this working. What do you think?

Best, Soumya