Closed mrtgocer closed 3 years ago
Yes, the library can solve nonlinear ODEs (and PDEs). You will need to change the architecture for the problem that you are trying to solve.
Can you show us the loss values that you are getting when you train the network?
Also, what architecture are you using? (How many layers, nodes per layer, what activation function, etc.)
I can share an example of the code and the architectures I've tried, but I can say that I tried all the activation functions in the PyTorch library, as well as many combinations of layers and nodes per layer. In the documentation, I couldn't find any solved nonlinear ODE example, so I thought there might be different approaches for nonlinear equations, such as linearizing them with some algorithm before feeding them to the network, solving them partly by hand, or some other trick.
By the way, I'm sharing the code below; maybe I haven't understood the library correctly. At the end of the code, I also include both the ANN solution and the analytical solution of the equation. Thank you very much.
ric_eq = lambda x, t: [diff(x, t) - (2*x - x*x + 1)]
init_vals_rc = [IVP(t_0=0.0, x_0=0.0)]
nets_rc = [
    FCNN(n_hidden_units=32, n_hidden_layers=2, actv=nn.Tanh),
    FCNN(n_hidden_units=32, n_hidden_layers=3, actv=nn.Tanh),
    FCNN(n_hidden_units=32, n_hidden_layers=1, actv=nn.Tanh),
]
solution_rc, loss_ex = solve_system(
    ode_system=ric_eq, conditions=init_vals_rc,
    t_min=0.0, t_max=4.0, nets=nets_rc, max_epochs=12000,
)
ts = np.linspace(0, 4.0, 40)
x_net = solution_rc(ts, as_type='np')
x_ana = 1+(np.sqrt(2)*np.tanh((np.sqrt(2)*(ts))+(np.log((-1+np.sqrt(2))/(1+np.sqrt(2))))/2))
plt.figure()
plt.plot(ts, x_net, label='ANN-based solution')
plt.plot(ts, x_ana, label='analytical solution')
plt.ylabel('x')
plt.xlabel('t')
plt.title('comparing solutions')
plt.legend()
plt.show()
plt.figure()
plt.plot(loss_ex['train_loss'], label='training loss')
plt.plot(loss_ex['valid_loss'], label='validation loss')
plt.yscale('log')
plt.title('loss during training')
plt.legend()
plt.show()
solution_rc(ts,as_type='np')
ANN SOLUTION array([ 0. , -0.13099788, -0.19428332, -0.25069585, -0.2893149 , -0.32184356, -0.3470278 , -0.36557654, -0.3789567 , -0.38858193, -0.39556387, -0.40069732, -0.40452737, -0.4074253 , -0.4096469 , -0.4113718 , -0.41272786, -0.41380626, -0.4146721 , -0.415372 , -0.4159388 , -0.41639596, -0.41676056, -0.41704527, -0.4172596 , -0.41741136, -0.4175068 , -0.41755173, -0.417551 , -0.41750926, -0.41743094, -0.41731992, -0.41718 , -0.41701493, -0.41682786, -0.4166221 , -0.41640043, -0.41616547, -0.41591984, -0.41566575], dtype=float32)
ANALYTICAL SOLUTION {0., 0.110295, 0.241977, 0.395105, 0.567812, 0.756014, 0.953566, 1.15295, 1.34636, 1.52691, 1.6895, 1.83124, 1.95136, 2.05074, 2.13133, 2.19563, 2.24629, 2.28578, 2.31632, 2.33981, 2.35777, 2.37147, 2.38188, 2.38977, 2.39576, 2.40028, 2.4037, 2.40628, 2.40823, 2.4097, 2.41081, 2.41165, 2.41228, 2.41276, 2.41312, 2.41339, 2.41359, 2.41374, 2.41386, 2.41395, 2.41401}
Thanks for the code. There is only one unknown function here, so one FCNN is enough. Still, I get the same wrong solution that you do. I've never tried an ODE that is quadratic in the unknown function before, and this method may indeed be unprepared for these ODEs.
The bug can be reproduced by:
from neurodiffeq import diff
from neurodiffeq.networks import FCNN
from neurodiffeq.ode import solve, IVP, Monitor, ExampleGenerator
import torch
from torch import nn, optim
import numpy as np
import matplotlib.pyplot as plt
def ric_eq(x, t):
return diff(x, t) - (2*x - x**2 +1)
fcnn_rc = FCNN(n_hidden_units=32, n_hidden_layers=1, actv=nn.Tanh)
adam = optim.Adam(fcnn_rc.parameters(), lr=0.001)
init_vals_rc = IVP(t_0=0.0, x_0=0.0)
train_gen = ExampleGenerator(32, t_min=0.0, t_max=4.0, method="equally-spaced-noisy")
solution_rc, _ = solve(
ode=ric_eq,
condition=init_vals_rc,
train_generator=train_gen,
t_min=0.0, t_max=4,
net=fcnn_rc,
batch_size=32,
max_epochs=20000,
monitor=Monitor(t_min=0.0, t_max=4.0, check_every=100),
)
# analytical solution
ts = np.linspace(0, 4.0, 40)
x_ana = 1+(np.sqrt(2)*np.tanh((np.sqrt(2)*(ts))+(np.log((-1+np.sqrt(2))/(1+np.sqrt(2))))/2))
plt.plot(ts, x_ana)
We have not solved nonlinear Riccati equations with this method before, but we have solved difficult nonlinear ODEs (the Clairaut equation and the Bratu problem). The method has also had success solving nonlinear PDEs (Navier-Stokes). It does require some tweaking to get the architecture right. I am bringing @marco-digio into this conversation because he solved Clairaut and Bratu, and he may have some insights about how to solve Riccati equations well.
The method does solve nonlinear ODEs. This is the code to solve the Riccati equation:
from neurodiffeq import diff
from neurodiffeq.networks import FCNN
from neurodiffeq.ode import solve, IVP, Monitor, ExampleGenerator
import torch
from torch import nn, optim
import numpy as np
import matplotlib.pyplot as plt
def ric_eq(x, t):
return diff(x, t) - (2*x - x**2 +1)
t_min, t_max = 0.0, 2.0
N = 32
fcnn_rc = FCNN(n_hidden_units=32, n_hidden_layers=1, actv=nn.Tanh)
adam = optim.Adam(fcnn_rc.parameters(), lr=0.001)
init_vals_rc = IVP(t_0= t_min, x_0=0.0)
train_gen = ExampleGenerator(N, t_min= t_min, t_max= t_max, method="equally-spaced-noisy")
solution_rc, _ = solve(
ode=ric_eq,
condition=init_vals_rc,
train_generator=train_gen,
t_min=t_min, t_max=t_max,
net=fcnn_rc,
batch_size=N,
max_epochs=20000,
optimizer=adam,
monitor=Monitor(t_min= t_min, t_max= t_max, check_every=100),
)
# analytical solution compared to found solution
ts = np.linspace(t_min, t_max, 40)
x_ana = 1+(np.sqrt(2)*np.tanh((np.sqrt(2)*(ts)+(np.log((-1+np.sqrt(2))/(1+np.sqrt(2))))/2)))
x_nn = solution_rc(ts, as_type='np')
plt.plot(ts, x_ana)
plt.plot(ts, x_nn)
plt.show()
# MSE
print(np.mean((x_ana-x_nn)**2))
Please check if you can make this work.
You can see that I've decreased t_max from 4.0 to 2.0. We have noticed that training converges to a local minimum when the range [t_min, t_max] is too big, as you pointed out. I suggest trying more training points, and the optimizer probably also needs to be tuned properly. We will continue to investigate too.
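For example (a minimal sketch with purely illustrative values, not a tuned recommendation), "more training points" and a retuned optimizer could look like this, reusing the names from the snippet above:

from torch import nn, optim
from neurodiffeq.networks import FCNN
from neurodiffeq.ode import ExampleGenerator

t_min, t_max = 0.0, 2.0
N = 128                                           # more training points per epoch (was 32)
train_gen = ExampleGenerator(N, t_min=t_min, t_max=t_max, method="equally-spaced-noisy")
fcnn_rc = FCNN(n_hidden_units=32, n_hidden_layers=1, actv=nn.Tanh)
adam = optim.Adam(fcnn_rc.parameters(), lr=3e-4)  # smaller learning rate than the 1e-3 used above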
@marco-digio I have run the code you specified; all the NN solutions are still negative while the analytical ones are positive, but, as you said, when I kept the range quite short the results were better. When the range is 4 (t_min=0, t_max=4) the MSE was 5.918, and when the range is 1 (t_min=0, t_max=1) the MSE was 2.206. But I still need more accurate results. I'll try to solve other kinds of nonlinear equations, and if I find new bugs I'll let you know. Thank you all for your time and interest.
@mrtgocer That is weird; I attach a picture of my result. As you can see in the left figure, running the code above on the range [0.0, 2.0], the solution is positive. The behaviour of the loss (right figure) is strange: it is really flat and then suddenly drops once it manages to exit the local minimum (maybe try increasing max_epochs). I get an MSE of 6*10^{-7}.
This is not a bug, but a challenge for the method near fixed points of a dynamical system. The difficulty can be offset somewhat by curriculum learning, in which a curriculum for sampling the training points is supplied during training. At this time, the software package does not offer this option. It may be fairly straightforward to add, and we will certainly try to find a way to handle it.
This is a way to implement curriculum learning: first train the network on the interval from 0.0 to 1.0, then on the full interval from 0.0 to 4.0. I get an MSE of $2*10^{-8}$.
from neurodiffeq import diff
from neurodiffeq.networks import FCNN
from neurodiffeq.ode import solve, IVP, Monitor, ExampleGenerator
import torch
from torch import nn, optim
import numpy as np
import matplotlib.pyplot as plt
def ric_eq(x, t):
return diff(x, t) - (2*x - x**2 +1)
t_min, t_max = 0.0, 1.0
N = 32
fcnn_rc = FCNN(n_hidden_units=32, n_hidden_layers=1, actv=nn.Tanh)
adam = optim.Adam(fcnn_rc.parameters(), lr=0.001)
init_vals_rc = IVP(t_0= t_min, x_0=0.0)
train_gen = ExampleGenerator(N, t_min= t_min, t_max= t_max, method="equally-spaced-noisy")
solution_rc, _ = solve(
ode=ric_eq,
condition=init_vals_rc,
train_generator=train_gen,
t_min=t_min, t_max=t_max,
net=fcnn_rc,
batch_size=N,
max_epochs=1500,
optimizer=adam,
monitor=Monitor(t_min= t_min, t_max= t_max, check_every=100),
)
t_max2 = 4.0
N2 = 128
adam2 = optim.Adam(fcnn_rc.parameters(), lr=0.001)
train_gen2 = ExampleGenerator(N2, t_min= t_min, t_max= t_max2, method="equally-spaced-noisy")
solution_rc, _ = solve(
ode=ric_eq,
condition=init_vals_rc,
train_generator=train_gen2,
t_min=t_min, t_max=t_max2,
net=fcnn_rc,
batch_size=N2,
max_epochs=5000,
optimizer=adam2,
monitor=Monitor(t_min= t_min, t_max= t_max2, check_every=100),
)
# analytical solution compared to found solution
ts = np.linspace(t_min, t_max2, 40)
x_ana = 1+(np.sqrt(2)*np.tanh((np.sqrt(2)*(ts)+(np.log((-1+np.sqrt(2))/(1+np.sqrt(2))))/2)))
x_nn = solution_rc(ts, as_type='np')
plt.plot(ts, x_ana)
plt.plot(ts, x_nn)
plt.show()
# MSE
print(np.mean((x_ana-x_nn)**2))
@marco-digio Can you show the loss and solution?
Yes:
Solution and loss after the first training
Solution and loss at the end of the training
Comparison of final solution and analytical one
@marco-digio Yeah, it works perfectly now. In the first training stage, the model learns the wrong solution if you keep the range long (t_min=0, t_max=3), and the error grows as this range increases. But if you first find accurate results on a small range (t_min=0, t_max=1), the model then works very well on the real range! Again, thank you all for your time, interest, and for the library!
I want to set the initial weights myself instead of initializing them randomly. What would the code for that be?
In general, you can create a network, which is a torch.nn.Module, and use the method here to initialize weights for specific layers. In addition, torch.nn.Module supports a load_state_dict method, as documented here. After initializing the network, simply pass it as a parameter to the solve (or solve_system, etc.) function, e.g., for the ODE case:
from neurodiffeq import diff
from neurodiffeq.ode import solve, Monitor
from neurodiffeq.conditions import IVP
from neurodiffeq.networks import FCNN
ode = lambda u, t: diff(u,t) + u
condition = IVP(0, 1)
net = FCNN(n_input_units=1, n_output_units=1, hidden_units=(32, 32))
net.load_state_dict(YOUR_STATE_DICT)
t_min, t_max = 0, 1
monitor = Monitor(t_min, t_max, check_every=100)
solution, loss_history = solve(
ode=ode,
condition=condition,
t_min=t_min,
t_max=t_max,
net=net,
monitor=monitor,
max_epochs=1000
)
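If you want to initialize specific layers yourself rather than loading a full state dict, here is a minimal sketch using plain PyTorch; the particular initializers (xavier_uniform_, zeros_) are just examples, not something specific to neurodiffeq:

from torch import nn
from neurodiffeq.networks import FCNN

net = FCNN(n_input_units=1, n_output_units=1, hidden_units=(32, 32))

def init_weights(m):
    # FCNN is a torch.nn.Module, so its Linear sublayers can be reached via net.apply
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)  # or m.weight.data.copy_(your_tensor) for fixed values
        nn.init.zeros_(m.bias)

net.apply(init_weights)  # runs init_weights on every submodule; then pass net to solve(...)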
It may be a silly question to ask: if we don't know the analytical solution of an ODE, how do we compare the ANN solution against it? After all, we can't solve many differential equations analytically.
You are right; in general, this is still an open topic of research. It's true that many differential equations don't have analytical solutions. The way we train the network is by minimizing the loss function, which is the sum of the squared PDE residuals evaluated on a set of randomly chosen points.
However, a lower loss doesn't imply a closer approximation to the real solution. The only interpretation we have for this loss function is that it should be 0 for the exact solution.
Instead of an analytical solution, I would recommend trying to find a numerical solution using a traditional method (FDM, FEM, FVM, spectral methods, etc.), because the upper bounds of the approximation error are usually known for these methods. We can discuss more if you specify what equation you are trying to solve. In the meantime, you might also find this paper interesting; it proposes a weighting function for the loss terms. Nonetheless, the choice of the loss function is still empirical.
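As a concrete illustration (a sketch only, assuming SciPy is available), the Riccati equation from earlier in this thread could be checked against an adaptive Runge-Kutta reference like this:

import numpy as np
from scipy.integrate import solve_ivp

# Numerical reference for dx/dt = 2x - x^2 + 1, x(0) = 0, using RK45.
ts = np.linspace(0.0, 4.0, 40)
ref = solve_ivp(lambda t, x: 2*x - x**2 + 1, (0.0, 4.0), [0.0], t_eval=ts)
x_num = ref.y[0]

# x_nn = solution_rc(ts, as_type='np')   # the ANN solution obtained above
# print(np.mean((x_num - x_nn)**2))      # MSE against the numerical reference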
This is a way to implement curriculum learning: first train the network on the interval from 0.0 to 1.0, then on the full interval from 0.0 to 4.0. I get an MSE of $2*10^{-8}$.
What is the meaning of N2 here? And how many hidden layers are there, one or two?
N2 is the number of training points used for each epoch.
Due to a wrong decision a long time ago, FCNN(n_hidden_layers=1) will actually give you 2 hidden layers. We have now deprecated the n_hidden_units and n_hidden_layers arguments in favor of hidden_units.
So, instead of
fcnn_rc = FCNN(n_hidden_units=32, n_hidden_layers=1, actv=nn.Tanh)
you can use
fcnn_rc = FCNN(hidden_units=(32, 32), actv=nn.Tanh)
to achieve the same result (2 hidden layers with 32 units each).
If you only want 1 hidden layer, use this instead
fcnn_rc = FCNN(hidden_units=(32,), actv=nn.Tanh)
This is a way to implement curriculum learning: first train the network on the interval from 0.0 to 1.0, then on the full interval from 0.0 to 4.0. I get an MSE of $2*10^{-8}$.
As mentioned here, is this curriculum learning? I have read about curriculum learning but never implemented it in code, so I just want to confirm. And thank you so much, Dear Liu, for your clarification.
Curriculum learning is the process where you train the network on some domain (interval in this case), and gradually expand the domain as training progresses.
In this code snippet, the solution is first trained on (0.0, 1.0) for 1500 epochs and then trained on (0.0, 4.0) for another 5000 epochs. So, yes, this is an example of curriculum learning.
This is a way to implement curriculum learning: first train the network on the interval from 0.0 to 1.0, then on the full interval from 0.0 to 4.0. I get an MSE of $2*10^{-8}$.
Is the training method here supervised or unsupervised?
It's unsupervised because it doesn't depend on any data. Instead, we use randomly sampled points.
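For intuition, here is a conceptual sketch of such an unsupervised, residual-based loss for the Riccati ODE above, written directly in PyTorch; it is not the library's internal code and it ignores how the initial condition is enforced:

import torch
from torch import nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
t = torch.rand(32, 1, requires_grad=True)        # randomly sampled points, no labelled data
x = net(t)
dxdt = torch.autograd.grad(x, t, grad_outputs=torch.ones_like(x), create_graph=True)[0]
loss = ((dxdt - (2*x - x**2 + 1)) ** 2).mean()   # the squared ODE residual is the training signal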
For the IVP x(0)=0, x'(0)=0, the code is IVP(t_0=0.0, x_0=0.0, x_0_prime=0.0). What will be the code for the IVP x''(0)=0, x'''(0)=0?
The 2nd order initial value problem hasn't been implemented yet. Can I ask what problem you are solving? There are very few problems (that I know of) that actually involve a 2nd order IVP.
One thing you can try is to rewrite your PDE/ODE in terms of u = x', provided no integral terms of u appear after the rewriting.
y'''+y''+y'+y^2=0, y(0)=y'(0)+y''(0)=1
For training the network, train_gen = ExampleGenerator(N, t_min=t_min, t_max=t_max, method="equally-spaced-noisy"). I just want to know exactly how many points the network is trained on. Also, what is the spacing between the training points?
y'''+y''+y'+y^2=0, y(0)=y'(0)+y''(0)=1
In this case, I would recommend rewriting the ODE as a system of first-order ODEs. Here's an example of this technique.
Specifically, try letting
y_0 = y
y_1 = y'
y_2 = y''
so that the equation becomes y_0' = y_1, y_1' = y_2, y_2' = -(y_2 + y_1 + y_0^2). A sketch of this reduction using solve_system is given below.
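A minimal sketch of that reduction using the solve_system API shown earlier in this thread. Note that the stated condition y(0) = y'(0) + y''(0) = 1 does not determine y'(0) and y''(0) individually, so the initial values below are just one illustrative choice:

from neurodiffeq import diff
from neurodiffeq.networks import FCNN
from neurodiffeq.ode import solve_system, IVP
from torch import nn

# y0 = y, y1 = y', y2 = y''  =>  y0' = y1, y1' = y2, y2' = -(y2 + y1 + y0^2)
ode_system = lambda y0, y1, y2, t: [
    diff(y0, t) - y1,
    diff(y1, t) - y2,
    diff(y2, t) + y2 + y1 + y0**2,
]
conditions = [
    IVP(t_0=0.0, x_0=1.0),  # y(0) = 1
    IVP(t_0=0.0, x_0=0.0),  # y'(0) = 0   (assumed for illustration)
    IVP(t_0=0.0, x_0=1.0),  # y''(0) = 1  (assumed for illustration; 0 + 1 = 1)
]
nets = [FCNN(hidden_units=(32, 32), actv=nn.Tanh) for _ in range(3)]
solution, _ = solve_system(
    ode_system=ode_system, conditions=conditions,
    t_min=0.0, t_max=1.0, nets=nets, max_epochs=5000,
)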
For training the network, train_gen = ExampleGenerator(N, t_min=t_min, t_max=t_max, method="equally-spaced-noisy"). I just want to know exactly how many points the network is trained on. Also, what is the spacing between the training points?
For the first question, the total number of training points (for each epoch) is N.
For the second question, I assume you are asking what the equally-spaced-noisy part does. Basically, equally-spaced-noisy does this for every epoch:
- Equally divide the interval into N-1 subintervals, which gives you N endpoints t_1, t_2, ..., t_N.
- For each t_i, add to it a random noise ε_i, which is drawn from a Gaussian distribution with standard deviation noise_std. If not specified, noise_std defaults to (t_max-t_min)/4.
- Return t_i + ε_i.
OK, you mean if t_min=0.0, t_max=1.0, N=11, and noise_std is not specified, then the network is trained on the points [0.25, 0.26, 0.27, ..., 1.25]. Am I right?
Not exactly. It will train on {0+ε_0, 0.1+ε_1, 0.2+ε_2, ..., 1.0+ε_10}, where ε_i ~ N(0, 0.25) is a random variable that follows the Gaussian distribution with 0 mean and 0.25 standard deviation.
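For reference, here is a tiny NumPy sketch of the sampling scheme as described above (not the library's internal code):

import numpy as np

def equally_spaced_noisy(N, t_min, t_max, noise_std=None):
    if noise_std is None:
        noise_std = (t_max - t_min) / 4.0           # default mentioned above
    t = np.linspace(t_min, t_max, N)                # N equally spaced endpoints
    return t + np.random.normal(0.0, noise_std, N)  # add Gaussian noise to each point

print(equally_spaced_noisy(11, 0.0, 1.0))           # e.g. {0+ε_0, 0.1+ε_1, ..., 1.0+ε_10}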
I may have missed it, but does the library solve nonlinear ODEs? For example, can it solve dy/dx = 2*y - y^2 + 1 with y_0 = 0, a nonlinear Riccati equation? When I tried to solve it, the results were very wrong. Is there a trick to solving nonlinear equations with this library, or is there a method you can suggest?
Thank you very much.