vavrines / SciML-DSMC

Direct simulation Monte Carlo with scientific machine learning
MIT License

Some questions #1

Open BijanGithub opened 2 years ago

BijanGithub commented 2 years ago

Dear Tianbai (@vavrines )

In this issue, I will ask questions about the Julia code.

Thanks a lot for your help.

BijanGithub commented 2 years ago

In the DSMC data set, at each iteration, the evolution of three groups (0,1,2) is written, e.g.:

[screenshot: DSMC data set showing the three groups (0, 1, 2) at each iteration]

Inside the trainer code, i.e. "trainer.jl" we have:

[screenshot: training-set definition in trainer.jl]

I see that X and Y are training sets. My impression from the code above is that it assumes there are two groups in the system. If that is correct, should I modify it to X, Y, Z for the current DSMC data set in this GitHub repository? Am I correct?

vavrines commented 2 years ago

Hi @BijanGithub

The neural network can be understood simply as a function. The input is named X, and the function output is NN(X). NN has trainable parameters, so we create a cost function C = ||NN(X) - Y||, where Y is the ground-truth data. So, as you can see, X is the status of the particles at one time step, and Y is their post-collision status computed from your DSMC codes.
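The setup above can be sketched in a few lines of Flux. This is a minimal illustration, not the repo's actual trainer: the layer sizes and data shapes are hypothetical, and random arrays stand in for the DSMC particle states.

```julia
using Flux

# Hypothetical network: 3 input features per particle -> 3 outputs
nn = Chain(Dense(3 => 16, tanh), Dense(16 => 3))

# Stand-ins for the data: X is the pre-collision state of 100 particles,
# Y is their post-collision state computed by the DSMC code
X = rand(Float32, 3, 100)
Y = rand(Float32, 3, 100)

# Cost function C = ||NN(X) - Y||, here as mean squared error
loss(model, x, y) = Flux.mse(model(x), y)
```

Training then minimizes `loss` over the network's parameters, pushing NN(X) toward the DSMC ground truth Y.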

BijanGithub commented 2 years ago

Hi @vavrines, I have a question about model loading. Suppose I have trained the model for one problem and saved it with `@save "model.jld2" nn`. Later, I want to train it further, so I load the model via `@load "model.jld2" nn`.

Next, I want to use new (X, Y) data and train the loaded model on them. I do not want to train a fresh model with no previously learned experience. How should I modify the following command to fulfill this purpose?

`sci_train!(nn, (X, Y), ADAM(); device = cpu, batch = 16, epoch = 5000)`

Thanks in advance!

vavrines commented 2 years ago

Hi @BijanGithub, a network stores its parameters, and the `sci_train!` function trains directly on those parameters. You don't need any additional treatment.
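In other words, the resume-training flow is just load-then-train. A hedged sketch, reusing the call from this thread (the new-data names `X2`, `Y2` are illustrative, and `sci_train!` is the repo's own trainer, assumed to be in scope):

```julia
using JLD2   # Flux and the package providing sci_train! are assumed loaded

# Restores nn together with its previously trained parameters
@load "model.jld2" nn

# New data (X2, Y2); sci_train! updates nn's existing parameters in place,
# so training continues from the learned state rather than from scratch
sci_train!(nn, (X2, Y2), ADAM(); device = cpu, batch = 16, epoch = 5000)
```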

BijanGithub commented 2 years ago

Thanks @vavrines ,

To have a clear view of the model there is another question:

Now I train it for the evolution of three groups of gases inside a cavity, and only for their temperature values [T1; T2; T3]. Suppose the model is the function nn(X_i), where X_i = [T1i, T2i, T3i].

Is it possible to use the already well-trained model for the 3-group case and apply it to a group with an arbitrary number n of members? If not, it means that for any group of n members the training must be done independently, am I right?

Thanks, Bijan

vavrines commented 2 years ago

@BijanGithub I think it's impossible for a fully connected NN. This kind of network takes a fixed-size input, e.g., a fixed picture resolution or text length. For other NNs it may be feasible: in natural language processing, for example, the text length can vary. For now, I suggest we stick with a fixed size and see how the model performs. If it works well, we can try other architectures like the transformer.
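The fixed-input-size constraint can be seen directly: the first dense layer's weight matrix has a fixed number of columns, so an input of a different length fails with a dimension mismatch. A small sketch (layer sizes are illustrative, not the repo's):

```julia
using Flux

# Network trained for 3 groups: expects exactly 3 input features
nn = Chain(Dense(3 => 16, tanh), Dense(16 => 3))

x3 = rand(Float32, 3)   # 3-group state: works, nn(x3) returns 3 values
x5 = rand(Float32, 5)   # 5-group state: nn(x5) throws a DimensionMismatch
```

So a model trained on 3-group data cannot be applied as-is to n-group data; each n would need its own fixed-size network (or a size-agnostic architecture).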