holl- / PDE-Control

Code for the ICLR 2020 paper "Learning to Control PDEs"
MIT License

About Legacy 2.4 Classical optimization: smokeoverfit.py #3

Open kuwt opened 3 years ago

kuwt commented 3 years ago

I am trying to run smokeoverfit.py, but it fails because it cannot find the file seq_vecpot.npy. I managed to run shapegen.py, but it only produces sim_00000X with Density_0000XX.npz files. Are there data generation scripts that produce the following files: seq_vecpot.npy, seq_pred.npy, seq_real.npy?

holl- commented 3 years ago

Those files contain arrays of shape (batch_size, frames, y, x, 1). They should be produced after training but it looks like the file that generates them is missing. I'll look into it but I'm not sure whether I can find the script again. However, I strongly recommend you use the updated code, not marked as legacy.
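For anyone rebuilding these files by hand, here is a minimal sketch of the layout described above. The helper name `check_sequence`, the frame count and the placeholder array are assumptions for illustration only; the real arrays would come from `np.load` on the missing .npy files.

```python
import numpy as np


def check_sequence(array, resolution=(128, 128)):
    """Validate the layout (batch_size, frames, y, x, 1) and return (batch_size, frames).

    Hypothetical helper, not part of the repository.
    """
    if array.ndim != 5 or array.shape[-1] != 1:
        raise ValueError(f"expected (batch_size, frames, y, x, 1), got {array.shape}")
    if tuple(array.shape[2:4]) != tuple(resolution):
        raise ValueError(f"expected spatial resolution {resolution}, got {array.shape[2:4]}")
    return array.shape[0], array.shape[1]


# Placeholder standing in for np.load("seq_real.npy"); 2 examples with 5 frames are made-up numbers.
placeholder = np.zeros((2, 5, 128, 128, 1), dtype=np.float32)
batch_size, frames = check_sequence(placeholder)
```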

kuwt commented 3 years ago

I have studied the updated code. I am currently reading the article "LEARNING TO CONTROL PDES WITH DIFFERENTIABLE PHYSICS", and I am quite confused about how exactly the classical optimization is done, since it is not described in detail in the article. That is why I am looking into the legacy code to see if I can find more information.

holl- commented 3 years ago

Ah, I see. It's actually quite simple, it just uses Adam to directly optimize the forces using basically the same loss function as for the network. The loss is defined here.
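To make "Adam directly optimizing the forces" concrete, here is a toy sketch in plain NumPy. It is not the code from smokeoverfit.py: the dynamics (a scalar state that simply accumulates the forces), the loss (squared mismatch at the final step plus a force penalty) and all names and hyperparameters are assumptions chosen so the gradient can be written by hand. The point is only that the per-step forces themselves are the free variables and there is no network.

```python
import numpy as np


def optimize_forces(x0, target, steps=8, force_weight=1e-2, iterations=2000, lr=0.01):
    """Directly optimize one force per time step with Adam (toy problem, not the smoke setup).

    Assumed dynamics: x_{t+1} = x_t + f_t, so x_final = x0 + sum(f).
    Assumed loss:     (x_final - target)^2 + force_weight * sum(f^2).
    """
    forces = np.zeros(steps)          # the optimized quantity: one value per time step
    m = np.zeros(steps)               # Adam first-moment estimate
    v = np.zeros(steps)               # Adam second-moment estimate
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    for i in range(1, iterations + 1):
        x_final = x0 + forces.sum()
        # Analytic gradient of the assumed loss with respect to every force.
        grad = 2.0 * (x_final - target) + 2.0 * force_weight * forces
        m = beta1 * m + (1.0 - beta1) * grad
        v = beta2 * v + (1.0 - beta2) * grad ** 2
        m_hat = m / (1.0 - beta1 ** i)    # bias correction
        v_hat = v / (1.0 - beta2 ** i)
        forces -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return forces


forces = optimize_forces(x0=0.0, target=1.0)
x_final = 0.0 + forces.sum()
```

In the real setting the hand-written gradient would be replaced by backpropagation through the differentiable simulation, but the optimization loop has the same shape.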

kuwt commented 3 years ago

What are the parameters to be optimized in this case? Is it the velocity potential? Is this trained velocity potential, i.e. hierarchical_vec_pot, a time-dependent function defined on the predefined time steps given in target_iterations? If so, I see that there are only 5 levels of predefined time steps. Can this 5-stage velocity potential produce meaningful values that bring the flow to the target observable state or shape? Why isn't a per-time-step velocity potential used? I'm sorry for asking so many questions.

holl- commented 3 years ago

No worries! Yes, the optimized quantity is the velocity potential, from which the velocity is derived in hierarchical_vec_pot(). This is just a trick to avoid performing pressure solves all the time. You could just as well optimize the velocity values directly, and it might even converge in fewer iterations. The function hierarchical_vec_pot is independent of time; it just computes the curl of a vector field. The hierarchical structure is necessary for optimization-related reasons only. Without it, long-range shifts would be incredibly hard to learn. So instead, it optimizes the 128x128 high-resolution version together with 64x64, 32x32, 16x16 and 8x8 downscaled versions of the scene.
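A rough sketch of that idea in NumPy, under stated assumptions: the potential is treated as a scalar stream function in 2D, the coarse levels are upsampled by nearest-neighbor repetition and summed, and the curl is taken with simple finite differences. None of this is copied from hierarchical_vec_pot(); the function names and the upsampling scheme are hypothetical, only the five resolutions come from the description above.

```python
import numpy as np

RESOLUTIONS = (8, 16, 32, 64, 128)  # coarse to fine, as listed above


def init_levels(rng, resolutions=RESOLUTIONS):
    """One independent parameter grid per resolution; these would be the optimized variables."""
    return [rng.normal(size=(r, r)) for r in resolutions]


def combine_levels(levels):
    """Upsample every level to the finest grid (nearest neighbor, an assumption) and sum."""
    full = levels[-1].shape[0]
    total = np.zeros((full, full))
    for level in levels:
        factor = full // level.shape[0]
        total += np.repeat(np.repeat(level, factor, axis=0), factor, axis=1)
    return total


def curl_2d(psi):
    """Velocity from a stream function: u = d(psi)/dy, v = -d(psi)/dx (forward differences)."""
    u = np.diff(psi, axis=0)    # shape (N-1, N)
    v = -np.diff(psi, axis=1)   # shape (N, N-1)
    return u, v


def divergence(u, v):
    """Discrete divergence du/dx + dv/dy, using the same difference stencil as curl_2d."""
    return np.diff(u, axis=1) + np.diff(v, axis=0)


levels = init_levels(np.random.default_rng(0))
psi = combine_levels(levels)
u, v = curl_2d(psi)
max_divergence = np.abs(divergence(u, v)).max()  # zero up to floating-point rounding
```

Because the velocity is the curl of a potential, it is divergence-free by construction, which is why no pressure solve is needed. A single entry of the 8x8 level covers a 16x16 block of the fine grid, so one parameter can shift a large region at once; that is the long-range effect the hierarchy is meant to provide.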

kuwt commented 3 years ago

Could you confirm whether my understanding of the function hierarchical_vec_pot() and smokeoverfit.py is correct?

  1. I think that hierarchical_vec_pot() takes an input velocity field and produces a target velocity field by means of a velocity potential. For each step n in the sequence, hierarchical_vec_pot() produces a different velocity field using independent parameters, in order to minimize the total force exerted.

  2. After training, the total force exerted should be minimal, so the network produces output that resembles a velocity sequence governed by natural flow. Of course, such a network is useless in general, since it is optimized for a single case only: given a different initial velocity, it fails to generate the natural flow.

holl- commented 3 years ago

That's pretty close. The solution you get is only valid for one example. However, in this example, there is no network involved at all; the vector potentials themselves are optimized. But you're right that you could just as well have a neural network predict the potential and optimize the network parameters. The only difference would be the parametrization of the solution. Actually, with neural networks, you could do away with the vector potential hierarchy, since the network architecture would supposedly already contain parameters with long-range effects.