Open BijanGithub opened 2 years ago
In the DSMC data set, at each iteration, the evolution of three groups (0,1,2) is written, e.g.:
Inside the trainer code, i.e. "trainer.jl" we have:
I see that X and Y are trainer sets. My impression from the above code is that its assumption is that there are two groups in the system. If that is correct, for the current DSMC data set inside the github, I should modify it to : X Y Z Am I correct?
Hi @BijanGithub
The neural network can be understood simply as a function. I name the input is named as X, and the function output is NN(X). NN has trainable parameters, so we would like to create a cost function: C = ||NN(X) - Y||, where Y is the ground-truth data. So as you can see here, X is the status of particles at one time step, and Y is their post-collision status computed from your DSMC codes.
Hi @vavrines
I have a question about model loading.
Suppose I have trained the model for one problem and saved it as @save "model.jld2" nn
.
Later I want to train it more on another occasion. So I load the model via @load "model.jld2" nn
.
Next, I want to use new (X,Y) data and train the loaded model with these new data. I do not want to train a new or a fresh model, which has no learned experience before. How I should modify the following command to fulfill this purpose?
sci_train!(nn, (X, Y), ADAM(); device = cpu, batch = 16, epoch = 5000)
Thanks in advance!
Hi @BijanGithub A network stores its parameters and sci_train! function will directly train on these parameters. You don't need to do any additional treatment.
Thanks @vavrines ,
To have a clear view of the model there is another question:
Now I train it for the evolution of three groups of gases inside a cavity and only for their temperature values [T1; T2; T3]. Suppose the model is the function "nn(X_i)" where X_i[T1i,T2i,T3i].
It is not possible to use the already well-trained model for a 3 group case and apply it for a group of arbitrary n members? If yes, it means that for any arbitrary group of n members, the training should be done independently, am I right?
Thanks, Bijan
@BijanGithub I think it's impossible for fully connected NN. This kind of network takes fixed sized input, e.g. the picture resolution or text length. For other NNs it may be feasible to do so. For example, in natural language processing, the text length can vary. For now, I suggest we stick on fixed size and see how the model performs. If it works well, we can try to leverage other architecture like transformer.
Dear Tianbai (@vavrines )
In this issue, I will ask questions about the Julia code.
Thanks a lot for your help.