Gillgren / Solving-Differential-Equations-with-Neural-Networks


Can it be used for solving second-order ODEs? #3

Open mach512 opened 4 years ago

mach512 commented 4 years ago

Hi,

I would like to use your code to solve a second-order ODE. I was wondering whether it will give the same result as a numerical ODE solver, or at least a good approximation of it?

Please let me know. I am interested in whether ODEs can really be solved with neural networks.

Gillgren commented 4 years ago

Hi,

Yes, it is possible to solve a second-order ODE. You will have to test whether it is as accurate as a "regular" numerical ODE solver for your problem. Look at BurgerEQ.py to see how second-order derivatives are handled.
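For illustration, here is a minimal sketch (not BurgerEQ.py itself, and assuming a jax-style setup rather than the repository's own code) of the only extra ingredient a second-order ODE needs: taking the derivative of the network twice.

```python
# Minimal sketch: first and second derivatives of a tiny network f(params, x),
# assuming jax is available. Not the repository code.
import jax
import jax.numpy as jnp
from jax import grad, vmap

def f(params, x):
    """One-hidden-layer network with scalar input x and scalar output."""
    (W1, b1), (W2, b2) = params
    h = jnp.tanh(W1 * x + b1)        # hidden activations, shape (hidden,)
    return jnp.dot(W2, h) + b2       # scalar output, so grad() can be applied

dfdx   = grad(f, argnums=1)          # df/dx
d2fdx2 = grad(dfdx, argnums=1)       # d2f/dx2, just grad() applied twice

# evaluate the second derivative on a grid of collocation points
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
hidden = 10
params = [(0.1 * jax.random.normal(k1, (hidden,)), jnp.zeros(hidden)),
          (0.1 * jax.random.normal(k2, (hidden,)), jnp.zeros(()))]
x_grid = jnp.linspace(0.0, 1.0, 100)
d2f_grid = vmap(d2fdx2, in_axes=(None, 0))(params, x_grid)
```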

Cheers

mach512 commented 4 years ago

Hi,

Thank you very much for helping me out. In BurgerEQ.py, you have used an analytic equation for the cost function, but I don't know how to represent my equation in analytic form. Can I use a numerical method (RK4) in place of the analytic representation? Would it be possible for you to provide the BurgerEQ.py code with a numerical solution rather than an analytical solution for the cost function? Kindly let me know. Thank you.

Gillgren commented 4 years ago

Hi,

The main idea is that the neural network itself acts as the numerical solution to the differential equation. The cost function is therefore the original differential equation rewritten in terms of the network and its derivatives, i.e. the residual of the equation. So we don't need the analytical solution or any other numerical method here; if we needed the analytical solution, this method would be useless. Please read through the description of the repository, where I have tried to explain this. If something is unclear, let me know.
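To make that concrete, here is a rough sketch (not the repository code, assuming jax) for a toy equation df/dx + f = 0 with f(0) = 1: the cost is just the squared residual of the equation plus a squared boundary term, and no analytical solution appears anywhere.

```python
# Rough sketch of the cost-function idea for df/dx + f = 0 with f(0) = 1.
import jax
import jax.numpy as jnp
from jax import grad, vmap

def f(params, x):
    """Tiny one-hidden-layer network with scalar input and scalar output."""
    (W1, b1), (W2, b2) = params
    return jnp.dot(W2, jnp.tanh(W1 * x + b1)) + b2

dfdx = grad(f, argnums=1)   # derivative of the network w.r.t. its input

def cost(params, x_grid):
    f_v  = vmap(f,    (None, 0))(params, x_grid)
    df_v = vmap(dfdx, (None, 0))(params, x_grid)
    residual = df_v + f_v               # equals 0 wherever the ODE holds
    boundary = f(params, 0.0) - 1.0     # enforces f(0) = 1
    return jnp.mean(residual**2) + boundary**2

# random initial parameters, just to show the cost can be evaluated
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
params = [(0.1 * jax.random.normal(k1, (10,)), jnp.zeros(10)),
          (0.1 * jax.random.normal(k2, (10,)), jnp.zeros(()))]
print(cost(params, jnp.linspace(0.0, 2.0, 50)))
```

Minimizing this cost with any optimizer pushes the residual towards zero, i.e. towards satisfying the differential equation.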

mach512 commented 4 years ago

Hi, thank you for helping me out. I have this second-order ODE: d2x/dt2 = -2*xm*bm*(dx/dt) - km^2*x + Gm*km*(Pm + C*(2*theta1)/(1 + np.exp(theta2*(theta3 - x)))), with boundary conditions f(0) = 0.3, f(1) = 0.3.

Can you help me understand how to represent this in the cost function? I am slightly confused.

Thank you very much for your time and consideration.

mach512 commented 4 years ago

equation.docx

Gillgren commented 4 years ago

Hi,

I understand what you mean. In your case, you would move all terms to one side so that you get 0 = -2*xm*bm*(dx/dt) - km^2*x + Gm*km*(Pm + C*(2*theta1)/(1 + np.exp(theta2*(theta3 - x)))) - d2x/dt2. Then you square the right-hand side so that it is always positive, and you minimize it so that it goes towards 0; minimizing the cost function then means that you are trying to satisfy the differential equation.

I think the easiest way to understand it is to look at ODE_example.py. At the top of that file, I've explained what differential equation I am trying to solve, how I set up the cost function, and how I calculate the derivative of the neural network function f (dfdx).
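As a rough sketch only (not ODE_example.py, treating xm, bm, km, Gm, Pm, C and theta1–theta3 as your known constants, and with f, dfdt, d2fdt2 being the network and its first and second derivatives set up as in the earlier sketch), the cost for your equation could look like this:

```python
# Rough sketch: squared ODE residual on the time grid plus squared boundary
# terms for f(0) = 0.3 and f(1) = 0.3. All constants are yours; f, dfdt and
# d2fdt2 come from a jax-style setup as in the earlier sketch.
def cost(params, t_grid, xm, bm, km, Gm, Pm, C, theta1, theta2, theta3):
    x   = vmap(f,      (None, 0))(params, t_grid)   # candidate solution x(t)
    dx  = vmap(dfdt,   (None, 0))(params, t_grid)   # dx/dt
    d2x = vmap(d2fdt2, (None, 0))(params, t_grid)   # d2x/dt2
    # everything moved to one side: zero wherever the ODE is satisfied
    residual = (-2 * xm * bm * dx
                - km**2 * x
                + Gm * km * (Pm + C * (2 * theta1) / (1 + jnp.exp(theta2 * (theta3 - x))))
                - d2x)
    bc1 = f(params, 0.0) - 0.3   # boundary condition f(0) = 0.3
    bc2 = f(params, 1.0) - 0.3   # boundary condition f(1) = 0.3
    return jnp.mean(residual**2) + bc1**2 + bc2**2
```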

Cheers

mach512 commented 4 years ago

Hi, I have written the script for the second-order ODE based on your code. But while running, the values are returned as 'nan'.

Can you kindly let me know whether the script is written correctly or whether something is wrong? I could not attach it as a .py file here, so I am sending the code in a Word document: code.docx

mach512 commented 4 years ago

Hi, can you also kindly let me know how to define x and t in the code? I know the timespan (t) of my simulation, but how do I define 'x' in the code? What is 'x' here? I used time values for 'x' and passed them to the 'f' function. Is that correct?

Gillgren commented 4 years ago

Hi. Regarding the nan issue: it can happen if the training becomes unstable. This is a known issue. Try changing momentum to 0 or something lower than 0.99.

Regarding x: in my code, x and t are the space and time domains that I'm solving the equation over, so x plays the same role as t. In your equation, it looks like x is the actual solution you are looking for. In my equation, f is the solution, and as you can see, I have not defined a range for f anywhere in the code. In other words, you only define the range of the input parameters, such as space and time, and not of the actual solution.
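As a tiny sketch (with a made-up time grid), the only thing you define up front is the grid of input values; the solution comes out of the trained network:

```python
import jax.numpy as jnp

# You choose the range of the input (a hypothetical time grid here)...
t_grid = jnp.linspace(0.0, 5.0, 1000)

# ...but you never define a range for the solution: after training, the
# solution on the grid is simply the network output, e.g.
# x_on_grid = vmap(f, (None, 0))(trained_params, t_grid)
```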

mach512 commented 4 years ago

Hi. Thank you very much for your valuable inputs. If it is not too much to ask, could you kindly run the code I sent you and let me know your comments on it? My simulation time is 0 to 5.15 with 4000 points in between, so I should provide that as the input parameter in the code. But how would we know the space-domain range before solving the equation? I am still not clear about the parameter 'x' passed to the 'f' function: can I use time values for 'x'? Could you elaborate a little more? Also, could you please run the code I sent in the Word document and let me know your thoughts?

Thank you very much for your time and consideration.

Gillgren commented 4 years ago

Hi, I did run your code and I got nan in the beginning, but I changed learning_rate=0.000000000001 and momentum=0.3. You have to experiment with what learning rate and momentum work for your case. Otherwise I think your code is fine; you can absolutely pass the time values as x to the f function. If time is your only parameter, then you only use that.

Gillgren commented 4 years ago

And it could be useful to look at PDE_example_pytorch.py, because there I use predefined packages, which helps avoid things like nan. That is probably a much better way to improve the training process.

mach512 commented 4 years ago

Hi, I am trying to work with your solution, but in my case the loss value is very large and it decreases very slowly. It starts at 6885479936.000000 and then decreases by about 100 for every 100 iterations. How can I resolve this?

In your implementation, why haven't you considered the Adam optimizer? I am not able to solve the problem with the current values. I cannot change my learning rate; it has to stay at a value of 10^-10, which I don't think is reasonable. Is this huge loss value because of the nest_momentum function?

Kindly let me know. Thank you.

Gillgren commented 4 years ago

Hi,

I do not know exactly how to solve your specific problem; you have to experiment and find what works for you. I am just showing you what I have done to solve my problems.

I have considered the Adam optimizer many times; it just did not make it into the code. You are free to implement Adam if you want to. If you use the PyTorch version, as I suggested before, you can easily experiment with Adam and other optimizers and see what works for you.

Cheers

mach512 commented 4 years ago

Hi, thank you very much for your reply. Yes, I have to find the solution for this myself. But just a general question: why is my loss value so huge? Is that normal? Did you get such large values while training your model? Kindly let me know.

Also, I have tried to recode your method with the Adam optimizer. The code is below; it gives an error stating "Gradient only defined for scalar-output functions. Output had shape (1,1)." Would it be possible for you to run the code and check, please?

```python
import sys
import numpy as np
import jax.numpy as np          # note: this shadows the NumPy import above
import numpy.random as npr      # needed by init_nn_params (missing from the pasted snippet)
from jax import random
from autograd import elementwise_grad
from jax.experimental import optimizers
from jax import grad
from jax import vmap
from jax import jit
import itertools
import time
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d

# model constants (theta and C are defined elsewhere in my full script)
bm = 1
km = (1 / 50)
Gm = 3.25
Pm = 10
tcal = np.linspace(0, 5.15, 4000)

def init_nn_params(scale, layer_sizes, rs=npr.RandomState(0)):
    """Build a list of (weights, biases) tuples, one for each layer."""
    return [(rs.randn(insize, outsize) * scale,   # weight matrix
             rs.randn(outsize) * scale)           # bias vector
            for insize, outsize in zip(layer_sizes[:-1], layer_sizes[1:])]

def swish(x):
    return x / (1.0 + np.exp(-x))

def f(params, inputs):
    for W, b in params:
        outputs = np.dot(inputs, W) + b
        inputs = swish(outputs)
    return outputs

init_params = init_nn_params(0.1, layer_sizes=[1, 10, 1])
# grad() needs a scalar output; for a scalar t, f returns shape (1, 1), which
# is what raises "Gradient only defined for scalar-output functions".
dfdx = grad(f, 1)
d2xdt2 = grad(dfdx, 1)
f_vect = vmap(f, (None, 0))
dxdt_vect = vmap(dfdx, (None, 0))
d2xdt2_vect = vmap(d2xdt2, (None, 0))

def loss(params, t, km, bm, Gm, theta, cval, Pm):
    # ODE residual plus squared boundary terms for f(0)=0.3 and f(1)=0.3
    eq = (d2xdt2_vect(params, t)
          + (2 * bm * dxdt_vect(params, t)) / km
          + f_vect(params, t) / km**2
          - (Gm / km) * (cval * (2 * theta[11]) / (1 + np.exp(theta[12] * (theta[13] - f_vect(params, t)))) + Pm))
    print('eq:', eq)
    bc1 = f(params, 0) - 0.3
    bc2 = f(params, 1) - 0.3
    print('parameter:', np.mean(eq**2), bc1**2, bc2**2)
    return np.mean(eq**2) + bc1**2 + bc2**2

def callback(params, step):
    if step % 1000 == 0:
        print("Iteration {0:3d} objective {1}".format(step, loss(params, tcal, km, bm, Gm, theta, C, Pm)))

grad_loss = jit(grad(loss, 0))
step_size = 1e-2
opt_init, opt_update, get_params = optimizers.adam(step_size)

def update(i, opt_state):
    params = get_params(opt_state)
    getlossval = 10  # callback(params, i)
    # grad_loss is already the gradient of the loss, so it is called directly here
    return [opt_update(i, grad_loss(params, tcal, km, bm, Gm, theta, C, Pm), opt_state), getlossval]

opt_state = opt_init(init_params)
itercount = itertools.count()
for epoch in range(5):
    print('itercount:', itercount)
    start_time = time.time()
    [opt_state, findloss] = update(next(itercount), opt_state)
    epoch_time = time.time() - start_time
```