Closed Ryszard2 closed 3 years ago
The training loss is so small, but the solution is not correct. If the code is correct, then one possible reason is that your solution is O(0.001). Try to rescale your problem such that it is O(1).
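For reference, one way to do this, and the same scaling the code posted later in this thread uses, is to work in scaled units so the network sees O(1) targets:

# Minimal sketch of the depth rescaling used later in this thread:
# work with h' = scale_h * h and g' = g / scale_h so the scaled equations stay consistent.
scale_h = 1000.0              # characteristic scale for the depth
h_L = 0.005 * scale_h         # left state becomes 5.0 instead of 0.005
h_R = 0.001 * scale_h         # right state becomes 1.0 instead of 0.001
g = 9.81 / scale_h            # gravity rescaled accordingly
# after prediction, convert back to physical units: h = h_pred / scale_h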
Thank you for the recommendation, @lululxvi , the quality of the solution improved dramatically.
Now, since there is still some significant gap between the NN outcome and the analytic solution, I'm trying to replicate a procedure presented in https://doi.org/10.1016/j.cma.2019.112789 for a Riemann problem, the very same problem I'm facing:
The problem is that I can't get the L-BFGS-B optimizer to iterate up to that epsilon condition: it stops way before. I tried this:
model.compile('adam', lr=0.0005)
model.train(epochs = 8000)
model.compile('L-BFGS-B')
early_stopping = EarlyStopping(min_delta=1e-16, patience=200000)
model.train(epochs = 200000, callbacks=[early_stopping])
but it didn't work. What's wrong?
Thank you, Riccardo
@Ryszard2 is it possible for you to upload the final code?
@lululxvi How can we rescale the problem? Can you provide sample code for that?
@Ryszard2
@katayooneshkofti See FAQ.
@katayooneshkofti
Yes, here's the code. Beware that, even though it seems pretty correct to me, I can't get to the scenario where the NN learns the correct solution with monster precision.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import matplotlib.pyplot as plt
import numpy as np
import deepxde as dde
import time as time
import matplotlib
from matplotlib import cm
from deepxde.backend import tf
from deepxde.callbacks import EarlyStopping
dde.config.real.set_float64()
dim_input = 2
dim_output = 2
scale_h = 1000.0           # depth scaling factor so the solution is O(1)
Time = 6.0
X_min = 0.0
X_max = 10.0
X_0 = 5.0                  # position of the initial discontinuity
h_L = 0.005 * scale_h      # left depth (scaled)
h_R = 0.001 * scale_h      # right depth (scaled)
g = 9.81 / scale_h         # gravity rescaled so the scaled equations stay consistent
def pde(x, y):
    # 1D shallow water equations in conservative form: dU/dt + dE/dx = S,
    # with U = (h, h*u), E = (h*u, h*u^2 + 0.5*g*h^2), S = (0, g*h*(Sgx - Sfx)).
    h = y[:, 0:1]
    u = y[:, 1:2]
    U1 = h
    U2 = h*u
    E1 = h*u
    E2 = h*u*u + 0.5 * h*h*g
    E1_x = tf.gradients(E1, x)[0][:, 0:1]
    E2_x = tf.gradients(E2, x)[0][:, 0:1]
    U1_t = tf.gradients(U1, x)[0][:, 1:2]
    U2_t = tf.gradients(U2, x)[0][:, 1:2]
    Sgx = 0.0   # bed-slope term (flat bottom)
    Sfx = 0.0   # friction term (frictionless)
    S1 = 0.0
    S2 = g*h * (Sgx - Sfx)
    equaz_1 = U1_t + E1_x - S1
    equaz_2 = U2_t + E2_x - S2
    return [equaz_1, equaz_2]
def on_initial(_, on_initial):
    return on_initial

def boundary(_, on_boundary):
    return on_boundary

def boundary_0(x, on_boundary):
    return on_boundary and np.isclose(x[0], X_min)

def boundary_L(x, on_boundary):
    return on_boundary and np.isclose(x[0], X_max)

def func_IC_h(X):
    # Riemann initial condition for the depth: h_L for x <= X_0, h_R otherwise.
    x = tf.cast(X[:, 0:1], dtype=tf.float64)
    c1 = tf.math.less_equal(x, X_0)
    f1 = tf.ones_like(x) * h_L
    f2 = tf.ones_like(x) * h_R
    f3 = tf.where(c1, f1, f2)
    return f3

def func_IC_u(x):
    return 0.0

def func_BC_h1(x):
    return h_L

def func_BC_h2(x):
    return h_R

def func_BC_u(x):
    return 0.0
geom = dde.geometry.Interval(X_min, X_max)
timedomain = dde.geometry.TimeDomain(0.0, Time)
geomtime = dde.geometry.GeometryXTime(geom, timedomain)
IC_h = dde.IC(geomtime, func_IC_h, on_initial, component = 0)
IC_u = dde.IC(geomtime, func_IC_u, on_initial, component = 1)
BC_h1 = dde.DirichletBC(geomtime, func_BC_h1, boundary_0, component = 0)
BC_h2 = dde.DirichletBC(geomtime, func_BC_h2, boundary_L, component = 0)
BC_u = dde.DirichletBC(geomtime, func_BC_u, boundary, component = 1)
BC = [IC_h, IC_u, BC_h1, BC_h2, BC_u]
data = dde.data.TimePDE(
    geomtime, pde, BC,
    num_domain = 6000,
    num_boundary = 20,
    num_initial = 200,
    anchors = TOT_an)   # TOT_an: user-defined anchor points, defined elsewhere (not shown in this snippet)

net = dde.maps.FNN(layer_size = [dim_input] + [30]*4 + [dim_output],
                   activation = "tanh",
                   kernel_initializer = "Glorot uniform")
model = dde.Model(data, net)
model.compile('adam', lr=0.0005)
model.train(epochs = 8000)
model.compile('L-BFGS-B')
early_stopping = EarlyStopping(min_delta=1e-16, patience=200000)
model.train(epochs = 200000, callbacks=[early_stopping])
plot_Time = np.linspace(0.0, Time, 2)
N_nn = 500
X_nn = np.linspace(X_min, X_max, N_nn)
X_nn = np.reshape(X_nn,(len(X_nn),1))
fig_type = 0
for i, dummy in enumerate(plot_Time):
    T_nn = np.ones((N_nn, 1)) * plot_Time[i]
    Q_nn = np.hstack((X_nn, T_nn))
    W_nn = model.predict(Q_nn)
    Z_nn = W_nn[:, 0] / scale_h   # convert the depth back to physical units
    fig = plt.figure()
    plt.plot(X_nn, Z_nn, 'r-', lw=3., label='NN Solution')
    plt.legend(loc='best')
    plt.title(r'T $= {:.2f}$ [s]'.format(plot_Time[i]))
    plt.grid(True)
    plt.show()
@lululxvi
It worked; the tweaks to the source code were crucial to extend the L-BFGS-B iterations as much as I wanted. I still couldn't get to a great result despite the mammoth 120 x 4 architecture. No problem, I'll settle for a solution that is pretty much acceptable around the shock.
I have no more questions. Thank you for your valuable help.
@Ryszard2 Thank you
I have recently downloaded the latest version of DeepXDE and I can't find train.py in the deepxde folder. How can I change the options for L-BFGS-B?
Use dde.optimizers.set_LBFGS_options()
https://deepxde.readthedocs.io/en/latest/modules/deepxde.optimizers.html#module-deepxde.optimizers.config
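For example, a minimal sketch (the keyword names follow the documentation linked above, model is the dde.Model already built, and the specific values are just illustrative):

import deepxde as dde

# Raise the L-BFGS-B stopping tolerances and iteration budget before compiling,
# so the optimizer does not terminate early.
dde.optimizers.set_LBFGS_options(
    maxcor=100,
    ftol=1e-16,      # tolerance on the relative decrease of the loss
    gtol=1e-16,      # tolerance on the projected gradient
    maxiter=200000,  # maximum number of iterations
    maxfun=200000,   # maximum number of function evaluations
)
model.compile("L-BFGS-B")
model.train()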
Thank you for your contribution! How far away was the final result from your exact solution? What was the order of the error value? I'm trying to model a wave through time and I'm running into a similar issue.
Just to make sure, do I use this line before or after compiling the LBFGS?
dde.optimizers.set_LBFGS_options(gtol=1e-12, maxiter=20000)
model.compile("L-BFGS-B")
or
model.compile("L-BFGS-B")
dde.optimizers.set_LBFGS_options(gtol=1e-12, maxiter=20000)
Hello @engsbk,
I'll try to explain what I've done and what I got:
- The L-BFGS-B experiment operated in that large 120 x 4 architecture is gone. I can't afford gigantic networks, plus I became an Adam person in the process of learning how to train the networks I use more effectively.
- I later repeated that 1D SWE problem, looking for better accuracy. Here's what I got and how I did it:
T = 0
Height ---> L2 Relative = 5.463e-04
Velocity ---> L2 = 9.583e-05
Flow ---> L2 = 1.078e-09
T = 6
Height ---> L2 Relative = 3.035e-04
Velocity ---> L2 Relative = 8.727e-03
Flow ---> L2 Relative = 6.017e-03
I did it with an honest 60 x 4 net, for about 300,000 epochs, running just the Adam optimization solver.
data = dde.data.TimePDE(
    geomtime, pde, IC_BC,
    num_domain = 6000,
    num_boundary = 10,
    num_initial = 1000)
net = dde.maps.FNN(
    layer_sizes = [dim_input] + [60]*4 + [dim_output],
    activation = "tanh",
    kernel_initializer = "Glorot uniform")
model = dde.Model(data, net)
model.compile('adam', lr=0.001, loss_weights=[1,1,1e4,5e4])
model.restore(my_path, verbose=1)
The results are decent but not great. In order to improve this model I'd recommend additional training, say 300,000 epochs with Adam, this time with a smaller learning rate, say 1e-4 or 1e-5 or even smaller, depending on how well the convergence improves. That's just my opinion :)
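In case it is useful, a minimal sketch of that continued training (my_path is the hypothetical path of the previously saved checkpoint; the learning rate and epoch count are just the values suggested above):

# Restore the earlier checkpoint and continue training with a smaller learning rate.
model.compile("adam", lr=1e-4, loss_weights=[1, 1, 1e4, 5e4])
losshistory, train_state = model.train(
    epochs=300000,
    model_restore_path=my_path,  # hypothetical checkpoint path
)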
@engsbk dde.optimizers.set_LBFGS_options should be used at the beginning (i.e., before model.compile).
@Ryszard2 Very good results.
Thanks a lot!!
I did not use the exact same approach you did, but I increased the IC weight, modified the gtol for L-BFGS, and also increased the maximum number of L-BFGS iterations. Results improved drastically!
For some reason it was L-BFGS that made the difference for me, not Adam.
The loss with the Adam optimizer would get stuck at one value around 20,000 epochs and would not improve even when I trained for 300,000 epochs.
I also have a comment here. I'm facing a problem with the IC I'm setting for my problem:
ic = dde.IC(geomtime, lambda x:0, lambda _, on_initial: on_initial)
I'm explicitly assigning it to zero, but when I try plotting it, these are my results:
I'm using a 32*4 FNN with 10,000 epochs for Adam and 50,000 epochs for L-BFGS-B. I also assigned a weight of 1000 to the IC and 100 to the BC. I tried multiple for-loops over different numbers of neurons, layers, epochs, points, weights, and sampling distributions, but the initial time (t = 0) is never a flat 0 in the prediction, as it should be.
For plotting the above figures I used this code. Please let me know if I could be handling u_pred in the wrong way.
#2D slices through the predicted surface for comparison
mesh = 100
t_slices = 5
t_delta = t_end/t_slices #time jump to get snapshot
print("Time step: ", t_delta)
######################################Prepare mesh for comparison#######################################
xm, ym = np.meshgrid(x,y)
X, u_true = gen_testdata()
u_pred = model.predict(X)
u_pred_reshaped = u_pred.reshape(100, 100, 100)
#######################################Zoom Factor######################################################
# 'exact' (the ground-truth array of shape (100, 100, 100)) is assumed to be defined elsewhere, not shown in this snippet
Zoom_limit = 0.001 + exact[:, :, :].max().item()  # To see the curve better. Smaller number -> bigger zoom
##############################################First Slice##############################################
fig = plt.figure(figsize=(600, 250), dpi=80)
fig.subplots_adjust(left=0.1, bottom=0.5, right=0.125, top=0.6, wspace=0.3, hspace=None)
print("Initial Time...")
ax = fig.add_subplot(t_slices,3, 1, projection='3d')
surf = ax.plot_surface(xm,ym, exact[:,:, 0].T, cmap=mpl.cm.gist_heat, label = 'Exact')
ax.plot_wireframe(xm,ym, u_pred_reshaped[:,:, 0], rstride=2, cstride=2,label = 'Prediction')
ax.view_init(45, 45)
ax.set_xlabel('x',fontsize = 15)
ax.set_ylabel('y',fontsize = 15)
ax.set_zlabel('u(x,y,t)');
ax.set_title('$t = %1.3f$'%(t[0]), fontsize = 15)
ax.set_xlim(d_start, d_end)
ax.set_ylim(d_start, d_end)
ax.set_zlim(-Zoom_limit, Zoom_limit)
###Observation Probes###
slice_at_x = 50
slice_at_y = 50
########################
#######################Slice Along x ##############################
ax = fig.add_subplot(t_slices,3, 2)
ax.set_xlabel('x',fontsize = 15)
ax.set_ylabel('u(x,y,t)',fontsize = 15)
ax.plot(x,exact[:,slice_at_y,0],'r-')
ax.plot(x,u_pred_reshaped[slice_at_y,:,0],'b--')
ax.set_xlim(d_start, d_end)
ax.set_ylim(-Zoom_limit, Zoom_limit)
#######################Slice Along y ##############################
ax = fig.add_subplot(t_slices,3, 3)
ax.set_xlabel('y',fontsize = 15)
ax.set_ylabel('u(x,y,t)')
ax.plot(y,exact[slice_at_x,:,0],'r-', label = 'Exact')
ax.plot(y,u_pred_reshaped[:,slice_at_x, 0],'b--', label = 'Prediction')
ax.set_xlim(d_start, d_end)
ax.set_ylim(-Zoom_limit, Zoom_limit)
ax.legend()
plt.show()
################################################Slices#################################################
for t_slice in range(t_slices):
    fig = plt.figure(figsize=(600, 250), dpi=80)
    fig.subplots_adjust(left=0.1, bottom=0.5, right=0.125, top=0.6,
                        wspace=0.3, hspace=None)
    print("Slice #", t_slice)
    if t_slice == t_slices:
        at_index = int(mesh/(t_slices-t_slice))
    else:
        at_index = int(mesh/(t_slices-t_slice)) - 1
    print("slicing at index ", at_index)
    #############################################################################
    ########################Lay planes on each other#############################
    ax = fig.add_subplot(t_slices, 3, 1, projection='3d')
    surf = ax.plot_surface(xm, ym, exact[:, :, at_index].T, cmap=mpl.cm.gist_heat, label='Exact')
    ax.plot_wireframe(xm, ym, u_pred_reshaped[:, :, at_index], rstride=2, cstride=2, label='Prediction')
    ax.view_init(45, 45)
    ax.set_xlabel('x', fontsize=15)
    ax.set_ylabel('y', fontsize=15)
    ax.set_zlabel('u(x,y,t)')
    ax.set_title('$t = %1.3f$' % (t[at_index]), fontsize=15)
    ax.set_xlim(d_start, d_end)
    ax.set_ylim(d_start, d_end)
    ax.set_zlim(-Zoom_limit, Zoom_limit)
    print("L2 relative error:", dde.metrics.l2_relative_error(exact[:, :, at_index].T, u_pred_reshaped[:, :, at_index]))
    ###Observation Probes###
    slice_at_x = 50
    slice_at_y = 50
    ########################
    #######################Slice Along x ##############################
    ax = fig.add_subplot(t_slices, 3, 2)
    ax.set_xlabel('x', fontsize=15)
    ax.set_ylabel('u(x,y,t)', fontsize=15)
    ax.plot(x, exact[:, slice_at_y, at_index], 'r-')
    ax.plot(x, u_pred_reshaped[slice_at_y, :, at_index], 'b--')
    ax.set_xlim(d_start, d_end)
    ax.set_ylim(-Zoom_limit, Zoom_limit)
    #######################Slice Along y ##############################
    ax = fig.add_subplot(t_slices, 3, 3)
    ax.set_xlabel('y', fontsize=15)
    ax.set_ylabel('u(x,y,t)')
    ax.plot(y, exact[slice_at_x, :, at_index], 'r-', label='Exact')
    ax.plot(y, u_pred_reshaped[:, slice_at_x, at_index], 'b--', label='Prediction')
    ax.set_xlim(d_start, d_end)
    ax.set_ylim(-Zoom_limit, Zoom_limit)
    ax.legend()
    plt.show()
I'm really looking forward for resolving this issue. Thanks everyone!
How about setting the non-IC loss weights to be 0?
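A minimal sketch of that setting, assuming the loss weights are ordered as [PDE, BC, IC] (the ordering used in the messages below); the learning rate and epoch count are only illustrative:

# Train with only the IC loss active: PDE and BC loss weights set to 0.
model.compile("adam", lr=1e-3, loss_weights=[0, 0, 100])
model.train(epochs=10000)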
I'm trying this option now, and I will update you with the results, but wouldn't this ignore training the FNN for BC and PDE?
I have finished running the model for weights = [0, 0, 100] for [pde, bc, ic]. These are the results. The initial time (t = 0) is still not a flat plane.
I noticed that assigning:
num_domain= 0,
num_boundary= 300,
num_initial= 0
while keeping the weights [1, 10, 100] gives the best results so far, like in the first plots I showed here. However, the problem at t = 0 is still there.
I also ran it again with [0,0,1] instead of [0,0,100], just to check whether the higher weight affects things negatively or positively, and these were the results:
num_initial = 0 means there are no training points for the IC. You need to set a non-zero num_initial.
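A minimal sketch of combining the two (the names IC_BC and OBS, and the point counts, are taken from the surrounding messages and are only illustrative):

data = dde.data.TimePDE(
    geomtime, pde, IC_BC,
    num_domain=6000,
    num_boundary=300,
    num_initial=1000,  # non-zero so the IC loss has collocation points at t = 0
    anchors=OBS,       # keep the circular observation points as additional training points
)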
The reason I chose num_initial = 0 is that the Sobol distribution gave me bad results, so I'm using these observation points instead, which gave much better results except for the initial-time plot.
##########################Observe Points############################
#############################Element1###############################
x_obs = 0.5
y_obs = 2.5
el_size = 0.2 #size, I think supposed to be radius r
Point_den = 50
phase = np.linspace(0, 2, num=Point_den)/2
x_source1 = np.sin(2*np.pi*phase)*el_size
x_source1 = np.tile(x_source1, Point_den) + x_obs
y_source1 = np.cos(2*np.pi*phase)
y_source1 = np.tile(y_source1, Point_den)*el_size + y_obs
t_source1 = np.linspace(0, t_end, num=Point_den)
t_source1 = np.repeat(t_source1, Point_den, axis=0)
#Also usable for multiple Elements
obs_x = np.concatenate((x_source1,), axis=None)
obs_y = np.concatenate((y_source1,), axis=None)
obs_t = np.concatenate((t_source1,), axis=None)
OBS = np.vstack((obs_x, obs_y, obs_t)).T
################################################################################
obs_circles = 60   #number of circles to observe during prediction
x_obs = 0.5
y_obs = 2.5
Point_den = 50
phase = np.linspace(0, 2, num=Point_den)/2
#Create circle domain sections
circle_radii = np.ones(obs_circles)
for obs_circle in range(obs_circles):
    circle_radii[obs_circle] = (d_end/obs_circles)*obs_circle
print("circle radii:", circle_radii)
for circle_radius in circle_radii:
    x_source1 = np.sin(2*np.pi*phase)*circle_radius
    x_source1 = np.tile(x_source1, Point_den) + x_obs
    y_source1 = np.cos(2*np.pi*phase)
    y_source1 = np.tile(y_source1, Point_den)*circle_radius + y_obs
    t_source1 = np.linspace(0, t_end, num=Point_den)
    t_source1 = np.repeat(t_source1, Point_den, axis=0)
    obs_x = np.concatenate((obs_x, x_source1), axis=None)
    obs_y = np.concatenate((obs_y, y_source1), axis=None)
    obs_t = np.concatenate((obs_t, t_source1), axis=None)
OBS = np.vstack((obs_x, obs_y, obs_t)).T
This code is supposed to sample the domain in a circular fashion:
I tried to animate it so I could try to understand the problem. The waves should propagate forward; however, they start in reverse and then start going in the right direction. I think this is also related to the initial prediction.
Do you think the value 0 as an initial condition is causing the problem? I'm also considering using a different optimizer instead of the Adam optimizer. Is it possible to use one with DeepXDE?
Yes, other optimizers are supported. Or, you may consider hard constraint of IC.
Update: the initial time is looking a lot better when I used lr=1e-6; however, I think the rest of the time slices can still be improved. I did not change the optimizer yet, nor did I use hard constraints.
Would you recommend changing the optimizer, or simply trying other learning rates?
I suggest using hard constraint of IC.
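For a zero initial condition, one common form of the hard constraint is to multiply the network output by a factor that vanishes at t = 0. A minimal sketch, assuming the network inputs are ordered (x, y, t) as in the OBS array above, so time is the last column:

# u(x, y, 0) = 0 is enforced exactly: tanh(t) is 0 at t = 0 and saturates to 1 for larger t.
net.apply_output_transform(lambda x, u: u * tf.tanh(x[:, -1:]))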
Thanks @lululxvi for your reply. I will test using hard constraints for IC. Also, how can I add this optimizer to the options of optimizers supported in DeepXDE: https://github.com/juntang-zhuang/Adabelief-Optimizer ?
I think it will be a good idea to test with it as well.
Modify the code at https://github.com/lululxvi/deepxde/tree/master/deepxde/optimizers
Thanks! I started modifying it. I added:
"adabelief": tfa.optimizers.AdaBelief(lr),
And I got this error:
37 update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
38 with tf.control_dependencies(update_ops):
---> 39 train_op = _get_optimizer(optimizer, lr).minimize(loss, global_step=global_step)
TypeError: minimize() got an unexpected keyword argument 'global_step'
So I passed global_step positionally instead of global_step=global_step, because according to the definition of minimize() in AdaBelief it does not accept this keyword argument (see https://www.tensorflow.org/addons/api_docs/python/tfa/optimizers/AdaBelief?hl=cs#minimize), but then I received this error:
ValueError: `tape` is required when a `Tensor` loss is passed. Received: loss=Tensor("Sum:0", shape=(), dtype=float32), tape=None.
While reading about the error, it seems the tape required here is the GradientTape that computed the loss, but I don't know what that is in this context. It would be great if we worked together to add this optimizer to the DeepXDE library; it has a strong reported performance.
Thanks again @lululxvi !
Which backend do you use? TFv1 or v2?
I'm using TF 2.7 but the beginning of my code says:
Using backend: tensorflow.compat.v1
Thanks @lululxvi ! The problem was not with the backend. I got it working from a different library https://github.com/juntang-zhuang/Adabelief-Optimizer instead of using the one in the TF add-ons https://www.tensorflow.org/addons/api_docs/python/tfa/optimizers/AdaBelief?hl=cs#args. Will share the results with you once it finishes running successfully.
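For reference, this is roughly how the standalone package exposes the optimizer (the hyperparameter values are only illustrative defaults taken from its README, as far as I recall):

from adabelief_tf import AdaBeliefOptimizer

# AdaBelief from the standalone adabelief-tf package (not tensorflow-addons).
opt = AdaBeliefOptimizer(learning_rate=1e-3, epsilon=1e-14, rectify=False)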
Great. You may also send a PR.
Thank you for the valuable recommendations! Sure. I have a couple of observations:
- I keep getting AutoGraph warnings like the one below, repeated for the pde function and for several boundary lambdas of the form (lambda x, on: np.array([on_boundary(x[i], on[i]) for i in range(len(x))])):
WARNING:tensorflow:AutoGraph could not transform <function pde at 0x00000266BB5FF4C0> and will run it as-is.
Cause: Unable to locate the source code of <function pde at 0x00000266BB5FF4C0>. Note that functions defined in certain environments, like the interactive Python shell, do not expose their source code. If that is the case, you should define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.experimental.do_not_convert. Original error: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
- Also, I have not figured out why I could not run AdaBelief through `tensorflow-addons`, but it ran without errors when I used the `adabelief-tf` library.
Which backend did you use, TF v1 or v2? tensorflow-addons should work for TF v2.
I was using TF v2.7 with "tensorflow" backend.
What is the error with tensorflow-addons?
"adabelief": tfa.optimizers.AdaBelief(lr),
And I got this error:
37 update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 38 with tf.control_dependencies(update_ops): ---> 39 train_op = _get_optimizer(optimizer, lr).minimize(loss, global_step=global_step) TypeError: minimize() got an unexpected keyword argument 'global_step'
So I used:
global_step
instead ofglobal_step=global_step
Because according to the definition of
minimize()
in AdaBelief, it does not have this variable. Referenced from: https://www.tensorflow.org/addons/api_docs/python/tfa/optimizers/AdaBelief?hl=cs#minimizebut then I received this error:
ValueError: `tape` is required when a `Tensor` loss is passed. Received: loss=Tensor("Sum:0", shape=(), dtype=float32), tape=None.
While reading about the error, it seems like the tape required here is the tape that computed the loss, but I don't know what it is. It would be great if we worked together to add this optimizer to the DeepXDE library. It has a reputable performance.
It was the same TypeError: minimize() got an unexpected keyword argument 'global_step' (followed by the tape ValueError) that I described above.
Use TF v2 backend.
Yes, I'm already using it, but I still get this error when trying to use tensorflow_addons to import AdaBelief:
Compiling model...
'compile' took 0.000419 s
Training model...
[... the same AutoGraph warnings as in my earlier message, for the pde function and the boundary lambdas ...]
Step Train loss Test loss Test metric
0 [3.16e-02, 3.37e-01, 2.37e-01] [3.16e-02, 3.37e-01, 2.37e-01] []
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<timed exec> in <module>
C:\ProgramData\Anaconda3\envs\deepxde_adabelief\lib\site-packages\deepxde\utils\internal.py in wrapper(*args, **kwargs)
20 def wrapper(*args, **kwargs):
21 ts = timeit.default_timer()
---> 22 result = f(*args, **kwargs)
23 te = timeit.default_timer()
24 print("%r took %f s\n" % (f.__name__, te - ts))
C:\ProgramData\Anaconda3\envs\deepxde_adabelief\lib\site-packages\deepxde\model.py in train(self, epochs, batch_size, display_every, disregard_previous_best, callbacks, model_restore_path, model_save_path)
359 if epochs is None:
360 raise ValueError("No epochs for {}.".format(self.opt_name))
--> 361 self._train_sgd(epochs, display_every)
362 self.callbacks.on_train_end()
363
C:\ProgramData\Anaconda3\envs\deepxde_adabelief\lib\site-packages\deepxde\model.py in _train_sgd(self, epochs, display_every)
376 *self.data.train_next_batch(self.batch_size)
377 )
--> 378 self._train_step(
379 self.train_state.X_train,
380 self.train_state.y_train,
C:\ProgramData\Anaconda3\envs\deepxde_adabelief\lib\site-packages\deepxde\model.py in _train_step(self, inputs, targets, auxiliary_vars)
291 self.sess.run(self.train_step, feed_dict=feed_dict)
292 elif backend_name == "tensorflow":
--> 293 self.train_step(inputs, targets, auxiliary_vars)
294 elif backend_name == "pytorch":
295 # TODO: auxiliary_vars
~\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\util\traceback_utils.py in error_handler(*args, **kwargs)
151 except Exception as e:
152 filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153 raise e.with_traceback(filtered_tb) from None
154 finally:
155 del filtered_tb
~\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\framework\func_graph.py in autograph_handler(*args, **kwargs)
1127 except Exception as e: # pylint:disable=broad-except
1128 if hasattr(e, "ag_error_metadata"):
-> 1129 raise e.ag_error_metadata.to_exception(e)
1130 else:
1131 raise
AttributeError: in user code:
File "C:\ProgramData\Anaconda3\envs\deepxde_adabelief\lib\site-packages\deepxde\model.py", line 190, in train_step *
opt.apply_gradients(zip(grads, trainable_variables))
AttributeError: 'tuple' object has no attribute 'apply_gradients'
However, I can import and run AdaBelief when using the other library: from adabelief_tf import AdaBeliefOptimizer.
I prefer tensorflow-addons over adabelief_tf, because tensorflow-addons is well maintained and kept up to date, while adabelief_tf has not been updated since Dec 2020. How about you open a PR for the modified code using tensorflow-addons? I will look at the code there and figure out the issue.
I've made the pull requests. I have modified the optimizers.py code for both TFv1 and TFv2 backends.
I suggest using hard constraint of IC.
I used this hard constraint for the IC:
net.apply_output_transform(lambda x, u: u * (1 - tf.exp(-x[:,1:2])))
Unfortunately, my results did not improve:
Please advise.
My general suggestion is that you start with a similar but simpler problem (smaller domain, shorter time, low-frequency solution, etc.); after you have solved the simple one, move on to the current one.
The training loss is so small, but the solution is not correct. If the code is correct, then one possible reason is that your solution is O(0.001). Try to rescale your problem such that it is O(1).
@lululxvi Sorry to disturb you. If my solution is O(0.1), how can I rescale my problem to O(1)? Is net.apply_output_transform(lambda x, y: y * c) the right approach, and should c be 0.1 or 10? Looking forward to your reply, thank you!
See FAQ.
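As I understand the FAQ, the idea is that the raw network output should be O(1) and the output transform maps it back to the physical scale, so for an O(0.1) solution c would be 0.1. A minimal sketch of that reading (not an authoritative answer):

# The network's raw output y is trained to be O(1); multiplying by 0.1
# maps it back to the O(0.1) physical solution.
net.apply_output_transform(lambda x, y: y * 0.1)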
Hi @Ryszard2 :), I have been experimenting with something similar for the shallow water equations. However, the accuracy is not as good as in your results in the last update; it is rather similar to when you first applied the scaling of the PDE. Can I ask what changes led you to significantly better results in the end? Thanks in advance.
Hello Riccardo! @Ryszard2 I am a doctoral student from South China University of Technology in China. My research focus is urban flood modeling, and I am currently very interested in the application of PINNs to the shallow water equations. I came across your master's thesis online and it has been very helpful to me. However, as I am just getting started with PINNs, I am not familiar with these techniques and have been unable to implement your code. Could you kindly send me the code from your master's thesis, or share your code repository with me? Thank you very much! My email: 415973548@qq.com. I would be very grateful.
Hello Lu Lu, DeepXDE is great and I am trying to use it to deal with a complex Dam Break problem, that is:
The 2D shallow water system I wrote is a purely hyperbolic problem, without any Laplacian contribution or viscosity.
I set up the training points only with the anchors, so that I can control the interval in every dimension:
that results in 150,801 training points.
The NN is a 2 + [40]*8 + 2 architecture trained with Adam for 10,000 epochs. Then it takes only 13 steps with L-BFGS-B and I get the message:
INFO:tensorflow:Optimization terminated with:
Message: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
Objective function value: 0.000000
Number of iterations: 11
Number of functions evaluations: 13
The losses at step 10013 are, more precisely:
[1.03e-08, 4.94e-10, 4.20e-07, 1.98e-09, 9.10e-10, 2.34e-09, 9.91e-10]
Even though I used a lot of training points and a very deep NN, obtaining a solution that fits the analytic one seems pretty much out of reach for me so far:
I tried to add a Laplacian with a little bit of viscosity in the momentum equation, but I couldn't get a better result.
I noticed that applying the hard constraint for the IC of a Riemann problem does not work, because the initial discontinuity does not move from its initial position. Essentially, I can't get the Riemann problem to evolve if I hard-constrain the IC.
What is wrong in my approach? What should be tweaked?
Thank you, Riccardo