barbagroup / jcs_paper_pinn

PINN paper that will be submitted to Journal of Computational Science

Reviewer 1 comments #3

Open piyueh opened 6 months ago

piyueh commented 6 months ago

The manuscript titled 'Predictive Limitations of Physics-Informed Neural Networks in Vortex Shedding' explores the application of Physics-Informed Neural Networks (PINN) in solving partial differential equations (PDEs), a topic that has garnered considerable attention in the community. The authors aim to elucidate the limitations of PINN in accurately predicting the complex behavior of systems. Specifically, the paper highlights PINN's inability to predict the vortex shedding phenomenon in 2D incompressible Navier-Stokes equations at a high Reynolds number (Re=200).

Given the current lack of complete understanding of the internal structure and optimization process of neural networks, it is challenging to delve into the fundamental limitations of the PINN. However, through a series of experiments, the authors investigated the accuracy and efficiency of the PINN. The paper rigorously analyzes the predicted results of the PINN, providing a preliminary glimpse into the nature of phenomena where PINN might encounter difficulties.

As I proceed with my review, I have several questions and points for clarification throughout the manuscript.

Here are some minor, technical comments.

Others

piyueh commented 5 months ago

2D TGV

There is a noticeable absence of an explanation regarding how the periodic boundary condition is incorporated into the loss term. While the manuscript aptly illustrates the construction of boundary loss for Dirichlet and Neumann boundary conditions, addressing the methods related to the periodic case would enhance the clarity of the presented work.

My comment:

The handling of periodic BCs is described in section 3.2.2 of the dissertation:

Consider a pair of periodic BCs on $x=x_1$ and $x=x_2$. We notice that $\sin(2\pi\frac{x-x_1}{x_2-x_1})$ and $\cos(2\pi\frac{x-x_1}{x_2-x_1})$ have a period of $x_2-x_1$. Hence, we expand the inputs of $G$ to

$$G = G(\sin(2\pi\frac{x-x_1}{x_2-x_1}), \cos(2\pi\frac{x-x_1}{x_2-x_1}), y, z, t; \Theta)$$

If the $y$ or $z$ component also has periodic BCs, it is converted in the same way. In the present work, the periodic BCs are handled with this approach, which builds the periodicity directly into the model; hence, no BC loss terms are needed for periodic BCs.
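For illustration, a minimal PyTorch sketch of this encoding could look as follows (the module, layer sizes, and domain bounds here are hypothetical, not the paper's actual code):

```python
import torch
import torch.nn as nn

class PeriodicEncoding(nn.Module):
    """Replace a periodic coordinate x with (sin, cos) features of period x2 - x1."""

    def __init__(self, x1: float, x2: float):
        super().__init__()
        self.x1 = x1
        self.period = x2 - x1

    def forward(self, x, y, t):
        # Map x onto the unit circle so the network output is x-periodic
        # by construction; no boundary-condition loss term is needed.
        phase = 2.0 * torch.pi * (x - self.x1) / self.period
        return torch.cat([torch.sin(phase), torch.cos(phase), y, t], dim=-1)

# Usage: feed the encoded features into the plain MLP G(.; Theta).
encoding = PeriodicEncoding(x1=0.0, x2=2.0 * torch.pi)
mlp = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 3))
x = torch.rand(128, 1); y = torch.rand(128, 1); t = torch.rand(128, 1)
out = mlp(encoding(x, y, t))  # identical values at x = x1 and x = x2
```

Because the network only ever sees the sin/cos features, its output is exactly periodic in $x$ regardless of the trained weights.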

Added in commit: 41d8f39ef2126c5f217a2e6efc5c27fa9c81c58b


Concerning the experimental results, the paper reports errors at t=0 and t=40. Given that t=0 corresponds to the initial condition, minimal error at this point is naturally expected due to the associated loss term. However, it would be valuable to explore whether the errors at t>0 are consistent with those at t=40 or if there is a discernible trend indicating an increase in error as time progresses. This analysis could provide insights into the extrapolation capabilities of PINN to later times.

My comment:

We addressed this matter briefly in our 2022 SciPy paper; see figure 4 in that proceedings paper.

Figures 4.3 and 4.4 in the dissertation also hint at this behavior.

The figures in both the SciPy paper and the dissertation indicate that the error is indeed much lower at $t=0$, because we have the exact solution (i.e., the initial conditions) to train the neural network at this time. The error jumps one to two orders of magnitude at $t > 0$, but it does not increase further over time. This indicates that the accuracy at $t=0$ has little influence on the accuracy at $t>0$, even though the error remains bounded in time. Moreover, the PINN method presented here is a spatiotemporal method without discretization in the traditional sense: it does not iterate and progress in time as classical numerical methods do. So whether it makes sense to discuss boundedness of the error, and what this error-versus-time behavior means, is unclear to us.
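For concreteness, the error-versus-time curves in those figures could be reproduced with a helper along these lines (a sketch only; `relative_l2_error` and its arguments are hypothetical, not part of our code base):

```python
import torch

def relative_l2_error(model, ref_fields, coords, times):
    """Relative L2 error of a trained PINN at each time snapshot.

    `ref_fields` maps a time to the reference solution (e.g., the analytical
    TGV solution) evaluated at `coords`.
    """
    errors = {}
    for t in times:
        t_col = torch.full((coords.shape[0], 1), float(t))
        pred = model(torch.cat([coords, t_col], dim=-1))
        ref = ref_fields[t]
        errors[t] = (torch.linalg.vector_norm(pred - ref)
                     / torch.linalg.vector_norm(ref)).item()
    return errors  # plotting errors vs. t shows the jump right after t = 0
```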

We should cite this previous work in the paragraph in the first column of page 5.

LB comment: The paper is long, and we don't want to add more figures if we can help it. So we want to respond to the reviewer, but we are not sure there is anything to add in the paper, except for a note pointing to the SciPy paper.

Added in commit fd69cbb and 0639f7b

piyueh commented 5 months ago

2D Cylinder

I am having difficulty comprehending the experimental setup for the Re = 200 case. In the data-driven PINN, the manuscript treats t=125 as the initial condition, as the vortex is triggered and stabilized around that time. However, for the unsteady PINN, t=0 is considered as the initial condition. This discrepancy raises concerns about the fairness of the comparison. Could the authors clarify the rationale behind this choice, and did they consider using t=0-15 as the initial condition for the data-driven PINN, or t=125 as the initial condition for the unsteady PINN? Such adjustments could potentially impact the results presented in Fig. 11 and others.

My comment:

The reason we have a case of data-driven PINN with training data at $t \in [125, 140]$ is that the regular PINN, which uses only $t=0$ data, does not show vortex shedding. One common reason a traditional CFD solver fails to produce vortex shedding is a perfectly symmetric configuration in the initial conditions and the spatial discretization. To rule out perfectly symmetric initial conditions as a possible reason that PINNs could not generate vortex shedding, one quick test is to use a solution that already exhibits shedding as the initial condition and continue the simulation from there. We applied this approach to the PINN as well. If PINNs were able to produce vortex shedding, we should see the shedding continue for $t > 140$, where we provide no training data to the PINN. Our results show that the PINN's prediction quickly becomes shedding-free, which indicates that the reason PINNs cannot generate shedding is not related to the initial conditions; it is more likely something inherent in PINNs.

One clarification to make here: when we only use data from $t=0$ to train PINNs, we call them data-free PINNs (or simply denote them as unsteady PINNs in this paper), because the data at $t=0$ are not generated by any experiment or computer simulation; they are usually just zeros or ones everywhere. When we instead have training data at, for example, $t \in [125, 140]$, we call them data-driven PINNs, because these data must be generated or obtained from other sources.
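To make the distinction concrete, a schematic of the two loss compositions might look like this (a sketch only; the equal weighting and the callable names are assumptions, not the paper's exact implementation):

```python
def total_loss(model, pde_residual, colloc_pts, ic_pts, ic_vals,
               data_pts=None, data_vals=None):
    """Aggregate PINN loss: PDE residual + initial-condition term, plus an
    optional data term for the data-driven variant."""
    loss = (pde_residual(model, colloc_pts) ** 2).mean()
    # Data-free ("unsteady") PINN: ic_vals are just the trivial t=0 fields.
    loss = loss + ((model(ic_pts) - ic_vals) ** 2).mean()
    if data_pts is not None:
        # Data-driven PINN: data_vals come from another solver, e.g., PetIBM
        # snapshots with developed shedding at t in [125, 140].
        loss = loss + ((model(data_pts) - data_vals) ** 2).mean()
    return loss
```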

First, we are not "comparing" the unsteady data-free and data-driven PINNs. Our intention was merely to find out whether PINNs can sustain shedding when given a solution that already exhibits it. If neither PINN can produce shedding, then comparing anything (e.g., performance, accuracy) is meaningless.

Second, regarding training against $t=0$ to $15$: we don't believe anything would change for the data-driven PINNs, as all PINNs have no issue predicting the solution at $t=0$ to $15$. The solutions in this time range do not exhibit shedding, so we don't believe training data in $t=0$ to $15$ would eventually lead to shedding.

See commits afbe288 and 1b55cccdbc37df5b087feb2ab77e15b97b8742a8.

Additionally, some experimental results have left me puzzled. In the Re=40 case, the loss of the unsteady solver in Fig.6 is notably higher than that of the steady solver. Nevertheless, in Fig.7, the results from the unsteady solver appear to be more consistent with PetIBM's results. I assume that the aggregated losses in Fig.6 are the losses for the training, but it remains unclear if these values represent the model's accuracy. Could the authors clarify whether these losses are indicative of model accuracy or if there is another metric that should be considered?

Similarly, it is intuitively expected that the unsteady solver should outperform the steady solver at Re=200 due to the time-varying solution with vortex shedding. However, this expectation is not reflected in the losses shown in Fig.10, making it unclear which solver is performing better.

My comment:

No, the training losses do not strictly reflect the prediction accuracy. Errors (or accuracy) are defined by the final predictions and the reference solutions, which are independent of how we optimize the model parameters. The losses, on the other hand, are tied to how we carry out that optimization. For example, the optimization objectives of the steady and unsteady PINNs differ in that the latter has a loss term coming from the initial conditions. So errors and losses are correlated, but they do not necessarily reflect each other, especially when comparing different models and optimization approaches.

To our knowledge at this moment, we are not aware of any other metric during training that can hint at the prediction accuracy/errors. If we wanted to mimic the validation process common in regular machine learning, we could monitor the errors against the initial and boundary conditions. However, we argue that this metric may not be meaningful because, as stated previously, errors at $t=0$ have little influence on those at $t>0$.
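If one did want such a pseudo-validation signal, it could be as simple as the following sketch (hypothetical names; as argued above, this signal may be of limited value):

```python
import torch

def ic_bc_monitor(model, ic_pts, ic_vals, bc_pts, bc_vals):
    """RMS errors against known IC/BC values, mimicking a validation set."""
    with torch.no_grad():
        ic_err = (model(ic_pts) - ic_vals).pow(2).mean().sqrt().item()
        bc_err = (model(bc_pts) - bc_vals).pow(2).mean().sqrt().item()
    return ic_err, bc_err
```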

See commit 810c6d8358bb7c66223acdc2a4129f16f546333b

piyueh commented 5 months ago

Discussion

My understanding of the discussion around numerical noise triggering vortex shedding in traditional CFD is hindered by my limited familiarity with the field. I believe that in traditional CFD solvers such as PetIBM, numerical noise arises from the recurrent calculation of the next state based on the current state. However, in the PINN context, predictions are made instantaneously given position and time coordinates, rather than through recurrent calculations. Considering the hypothetical scenario of a 'perfect' PINN with infinite numerical precision and zero aggregated loss, wouldn't it be reasonable to expect a vortex-free prediction from such an idealized model?

My comment:

Yes, we intuitively believe this would be the case. However, we cannot prove this hypothesis mathematically, so we tried our best not to make such an affirmative claim.

piyueh commented 5 months ago

Others

Finally, acknowledging that this may extend beyond the scope of the paper, I am personally intrigued and curious. Have the authors contemplated selectively sampling points for the PDE loss, particularly in regions where complex phenomena like the triggering of vortices are anticipated? Considering the cylinder problem, the intricacies are expected just behind the cylinder, and indeed, this is where the discrepancies in the data-driven PINN results are observed. I wonder if targeted and focused training in this specific region could yield successful PINN predictions.

My comment:

We have conducted some preliminary studies using more collocation points in regions of high velocity gradients, which include the region behind the cylinder in the cylinder flow. We did not observe any difference in these preliminary studies. However, due to limited time, we did not pursue this direction further.
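For reference, the biased-sampling idea amounts to something like the following sketch (the wake-box bounds and fractions here are illustrative assumptions, not the values used in our preliminary runs):

```python
import torch

def sample_collocation(n, wake_frac=0.5, cylinder=(0.0, 0.0, 0.5),
                       domain=((-10.0, 30.0), (-10.0, 10.0))):
    """Draw collocation points, oversampling a box just behind the cylinder."""
    (xmin, xmax), (ymin, ymax) = domain
    n_wake = int(n * wake_frac)
    # Uniform points over the full domain.
    pts = torch.rand(n - n_wake, 2)
    pts[:, 0] = xmin + (xmax - xmin) * pts[:, 0]
    pts[:, 1] = ymin + (ymax - ymin) * pts[:, 1]
    # Extra points concentrated in the near wake, where gradients are high.
    cx, cy, r = cylinder
    wake = torch.rand(n_wake, 2)
    wake[:, 0] = cx + r + 10.0 * r * wake[:, 0]   # a few diameters downstream
    wake[:, 1] = cy - 2.0 * r + 4.0 * r * wake[:, 1]
    return torch.cat([pts, wake], dim=0)
```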

LB: Let's look into this new preprint and refer to it in the response to the reviewer:

— Yang, S., Kim, H., Hong, Y., Yee, K., Maulik, R. and Kang, N., 2024. Data-Driven Physics-Informed Neural Networks: A Digital Twin Perspective. arXiv preprint arXiv:2401.08667.

piyueh commented 5 months ago

Minor issues

The phrase '... suggests that PINN is numerically dispersive and diffusive' in the abstract could be refined for accuracy. Consider '...suggests that the results of the PINN exhibit numerical dispersion and diffusion.' This adjustment would accurately convey that the analyses pertain to the output of a trained PINN, not the PINN itself.

Edited in commit: 16ae538252fb7ba784f62b00c8759278da66f9b7

In Fig.1, the presence of numerous arrows contributes to a visually complex presentation. Simplifying the figure by grouping losses based on their origins could enhance clarity for readers.

Edited in commit: f8c46c71e88a818c89dec327531ff6b931736227

The term 'unsteady PINN solver' is introduced in Section 3.1, with clarification provided only in Section 3.2 regarding both unsteady and steady PINN. Consider providing an early explanation or reference for an 'unsteady PINN solver' to improve coherence.

Edited in commit: 00520ca0bcf0e8c90ff2bb0fb2d7d09a3186d7a7

While the authors may already be aware, PyTorch allows for double-precision floats via `torch.set_default_dtype(torch.float64)`. However, I don't expect this adjustment to significantly alter the presented results.
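For reference, this setting must run before the model is constructed so that parameters are created in double precision; a minimal example:

```python
import torch

# Make newly created tensors and module parameters default to float64.
torch.set_default_dtype(torch.float64)

model = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.Tanh(),
                            torch.nn.Linear(64, 3))
print(next(model.parameters()).dtype)  # torch.float64
```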

Edited in commit: 35716c9b141c5c9b96b674424857a70076c3d71e

In Figs. 6 and 7, the opposite colors of the steady and unsteady solvers may cause confusion. Consider unifying the colors for consistency. The same suggestion applies to Figs 10 and 11.

Edited in commit: 8fff958c0b4f52dd4d76fa52eeedd29436ea2b42

In Figs. 17 and 18, adding the titles of PetIBM and Data-driven PINN on the left and right sides, as done in Figs. 19 and 20, would enhance the visual consistency of the figures.

Edited in commit: 8fff958c0b4f52dd4d76fa52eeedd29436ea2b42