Wolpes11 opened this issue 2 years ago
Yes, it is possible: you can use a PFNN, as shown in the example "elliptic_inverse_field.py".
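(For reference, a minimal sketch of the idea, not the exact example code: give the network one output for the PDE solution and a second, parallel output for the unknown coefficient field, so the field is learned alongside the solution.)

```python
import deepxde as dde

# Sketch only: a PFNN with two parallel subnetworks, one output for the
# solution u(x) and one for the unknown coefficient field q(x), in the
# spirit of elliptic_inverse_field.py.
net = dde.maps.PFNN([1, [20, 20], [20, 20], 2], "tanh", "Glorot uniform")
```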
Thank you so much! It should work for my problem.
Instead of starting a new thread, I have just one more quick question. How do I define boundary/initial conditions simply from data? I have a grid of measurements at different positions x and times t, and I want to use the values at x=x_start and x=x_end as boundary conditions and the measurements at t=0 as the initial condition.
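(For reference, a minimal sketch of what such data-based conditions can look like with `dde.PointSetBC`, mirroring the code posted later in this thread; the arrays below are dummy placeholders.)

```python
import numpy as np
import deepxde as dde

# Dummy measurement grid: columns of observe_x are (x, t); y are the values.
observe_x = np.array([[1.0, 0.0], [1.5, 0.0], [2.0, 0.0], [1.0, 0.5], [2.0, 0.5]])
y = np.array([[0.1], [0.2], [0.3], [0.4], [0.5]])
x_start, x_end = 1.0, 2.0

# Boundary condition: measured values at x = x_start and x = x_end.
on_boundary = (observe_x[:, 0] == x_start) | (observe_x[:, 0] == x_end)
bc = dde.PointSetBC(observe_x[on_boundary], y[on_boundary], component=0)

# Initial condition: measured values at t = 0.
at_t0 = observe_x[:, 1] == 0.0
ic = dde.PointSetBC(observe_x[at_t0], y[at_t0], component=0)
```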
Thank you, Dr. Lu, for your reply, and thank you for your really great work!
I have just one last question to bother you with. I have 157,511,520 data points, which in single precision are ~630 MB, and I'm working with a GPU with 80 GB of memory. But the code crashes at epoch 0 with an "Out Of Memory" error. What could the problem be?
Is it the CPU or the GPU that is out of memory?
I think it is the GPU. I'm working with 2 TB of CPU memory, which can load the data points correctly. The model compiles correctly, and the code prints the loss at epoch 0, but then it crashes because it tries to allocate many other tensors of size [2 x N_points, N_neurons_per_layer], as far as I can see.
Thanks again for your time.
Then the solutions I can think of are either using a smaller dataset or using mini-batches.
I tried using `PDEResidualResampler`, but it doesn't work; I'm not even sure I'm using it correctly. Here is the part of the code where I implement the model:
```python
geom = dde.geometry.Interval(1, 2)
timedomain = dde.geometry.TimeDomain(0, 1)
geomtime = dde.geometry.GeometryXTime(geom, timedomain)

observe_x, y = gen_traindata()

# Measurements at x = 1 and x = 2 serve as the boundary conditions.
BC = dde.PointSetBC(
    observe_x[(observe_x[:, 0] == 1.0) | (observe_x[:, 0] == 2.0)],
    y[(observe_x[:, 0] == 1.0) | (observe_x[:, 0] == 2.0)],
    component=0,
)
# Measurements at t = 0 serve as the initial condition.
IC = dde.PointSetBC(observe_x[observe_x[:, 1] == 0], y[observe_x[:, 1] == 0], component=0)
# The remaining (interior) measurements are used as PDE training points.
TP = observe_x[(observe_x[:, 0] != 1.0) & (observe_x[:, 0] != 2.0)]

data = dde.data.TimePDE(
    geomtime,
    pde,
    [BC, IC],
    num_domain=0,
    num_boundary=0,
    num_initial=0,
    anchors=TP,
)

net = dde.maps.PFNN(
    [2, [30, 30, 30], [20, 20, 20], [20, 10, 10], [20, 1, 1], [20, 1, 1], 3],
    "tanh",
    "Glorot uniform",
)
model = dde.Model(data, net)
model.compile("adam", lr=0.002)

resampler = dde.callbacks.PDEResidualResampler(period=100)
losshistory1, train_state1 = model.train(epochs=10000, callbacks=[resampler])
```
Thank you!
`PDEResidualResampler` only works for the points sampled by DeepXDE, not for points provided by users via `anchors`. Your domain seems small, and maybe a small dataset is enough.
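(For contrast, a sketch of the setup the resampler does handle, reusing the names from the snippet above and letting DeepXDE sample the residual points itself.)

```python
# Let DeepXDE sample 10,000 PDE residual points (num_domain > 0) instead of
# passing them via anchors; PDEResidualResampler then resamples these points.
data = dde.data.TimePDE(
    geomtime, pde, [BC, IC], num_domain=10000, num_boundary=0, num_initial=0
)
model = dde.Model(data, net)
model.compile("adam", lr=0.002)
resampler = dde.callbacks.PDEResidualResampler(period=100)
model.train(epochs=10000, callbacks=[resampler])
```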
Thank you so much for your help.
I also tried with `batch_size` instead of `PDEResidualResampler`, but the problem remains the same.
The domain is so small because I normalized both space and time; otherwise, my domain spans (500 m, 550 m) in space and more than 20 days in time, with a resolution of 1 s.
Do you mean that you used a small training dataset and still got the OOM error? That seems strange.
I tried setting a batch size when training the model:

```python
losshistory1, train_state1 = model.train(epochs=10000, batch_size=1)
```

But it gives me the same OOM error.
The problem is that the largest training set I can fit in memory does not capture all the features of the data (such as long-period oscillations). Thank you!
@Wolpes11 I also met a similar problem. I have a lot of experimental data and I cannot fit it all into memory. For a normal ANN in TensorFlow/Keras, we can easily use the `batch_size` argument for mini-batching to deal with OOM. However, for PINNs in DeepXDE, I think mini-batching for `PointSetBC` has not been implemented yet; if you dig into the source code of the `train()` function, there is nothing implemented for the `batch_size` argument for PINNs.
Thank you @haison19952013! Yes, I had a look at the source code. What puzzles me, however, is that the entire dataset actually fits in memory (it is more or less 600 MB), and yet the OOM error arises at the beginning of the training phase. The architecture of the neural net should be fixed, independently of the number of samples: 2 inputs (space and time) and 1 output, right?!
Yes, the architecture of the neural net is fixed, and I think there is no problem with the architecture. In my case, the OOM also occurs after epoch 0. In your case, the data itself can fit into your system; however, after epoch 0 the model has to compute and store a lot of information, especially the gradient information for the PDE, for all the data at once. I think this can be the main reason for the OOM. What do you think?
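(A rough back-of-the-envelope estimate supports this; the tensor shape is taken from the OOM report earlier in this thread, and float32 storage is assumed.)

```python
# One intermediate tensor of shape [2 * N_points, N_neurons] in float32:
n_points = 157_511_520
n_neurons = 30  # widest hidden layer in the snippet above
per_tensor_gib = 2 * n_points * n_neurons * 4 / 1024**3
print(f"{per_tensor_gib:.1f} GiB")  # ~35.2 GiB, so a few such tensors exceed 80 GB
```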
P.S.: If you really want to use mini-batching right now, I think you can look at the paper "Hidden Fluid Mechanics", which comes with source code. The authors also use a lot of data and apply mini-batch training.
`batch_size` in `model.train` doesn't work for PINNs, because in a PINN there are different types of training points, such as PDE points, BC points, IC points, etc., so it is not clear what `batch_size` really means here.
The current DeepXDE version supports mini-batching of the PDE residual points via `PDEResidualResampler`, but the training points provided by `anchors` will not be used as mini-batches. In order to also mini-batch the `anchors`, you need to modify the source code at the following line:
https://github.com/lululxvi/deepxde/blob/4714a1f4268489c7d2e50302ddefd54a8aa5defb/deepxde/data/pde.py#L237
Instead of using all the points in `self.anchors`, you can simply pick a random subset of `self.anchors`. Then `PDEResidualResampler` will also work for the `anchors` points.
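(A hedged sketch of what that modification might look like; `self.anchors` and `X` are the names at the linked line, and the 1000-point subset size is arbitrary.)

```python
# Inside deepxde/data/pde.py, where the anchors are stacked onto the training
# points: use a random subset of the anchors instead of all of them.
idx = np.random.choice(len(self.anchors), size=1000, replace=False)
X = np.vstack((self.anchors[idx], X))
```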
Yes, I agree, @haison19952013. I think the main problem is the temporal gradient, which involves the entire dataset. Thank you for your suggestion; I will have a look at the paper "Hidden Fluid Mechanics", even though I find the DeepXDE package easier to adapt to different PDEs and applications.
Dear @lululxvi, I modified the source code as you suggested:

```python
# Randomly pick 1000 anchors each time the training points are assembled.
# (Note: randint's upper bound is exclusive, so it should be
# len(self.anchors), not len(self.anchors) - 1.)
idx = np.random.randint(0, len(self.anchors), 1000)
X = np.vstack((self.anchors[idx, :], X))
```

But I get the same OOM error. Am I doing something wrong? Thank you!
It looks OK. You may check the size of `X`.
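(For example, a quick check right after the modified line in `pde.py`:)

```python
X = np.vstack((self.anchors[idx, :], X))
print("training points:", X.shape)  # expect (1000 + other sampled points, 2)
```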
Yes, I have already checked it; it is correctly (1000, 2). It seems that the point where the code crashes with the OOM error is not this one.
Thank you!
Hi Dr. @lululxvi, I tried different codes with the dataset I'm using, and the batch "trick" you suggested works fine there. Do you have any clue why the randomization of the training points does not work in this case?
Thanks in advance!
What do you mean by "the batch trick" and "randomization of the training points"?
The following modification of your source code, as you previously suggested:

```python
idx = np.random.randint(0, len(self.anchors), 1000)
X = np.vstack((self.anchors[idx, :], X))
```

Thank you!
You may directly check what is passed as the network input during training, and then figure out step by step what goes wrong in your code.
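(One way to do that, assuming the `TrainState` object exposes the assembled inputs as `X_train`:)

```python
# Run a single epoch, then inspect what was actually fed to the network.
model.train(epochs=1)
print(model.train_state.X_train.shape)  # should be small if the subsampling works
```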
Dear Dr. Lu,
I have a question regarding DeepXDE. I would like to solve an inverse problem for a PDE whose coefficients are functions of space and time, A(x,t) and B(x,t). Is this possible? All the examples I have found use constant (scalar) coefficients.
Do you have an example for this situation, please? Thank you in advance for your time!