barbagroup / snake-repro

An article reporting the travails of reproducibility in unsteady CFD studies
https://lorenabarba.com/news/reproducible-and-replicable-cfd-its-harder-than-you-think/

Reviewer 1 comments #7

labarba opened 7 years ago

labarba commented 7 years ago

Recommendation: Accept If Certain Minor Revisions Are Made

Comments:

I think the paper is great, but I would love for them to make it even better. This should be strongly encouraged. This should not preclude publication. Getting this work published is essential.

Additional Questions:

1) Please summarize what you view as the key point(s) of the manuscript and the importance of the content to the readers of this periodical :

2) Is the manuscript technically sound? Please explain your answer in the Detailed Comments section. : Yes

3) What do you see as this manuscript's contribution to the literature in this field?: This paper is seminal on an essential aspect of modeling & simulation. Many of the issues are elucidated eloquently here. We need to publish results like this and our failure to do so is holding the field back.

4) What do you see as the strongest aspect of this manuscript?: This is an important paper and topic. It absolutely needs to be published and read by the community at large. The message is important and it is delivered in a compelling and deeply engaging manner. The importance of publishing this sort of “negative” result cannot be overstated. Nonetheless a few weaknesses in the paper help to undermine the message enough that some corrective action is necessary. With a handful of changes this paper can become a real touchstone for the community to look to for paths for improving our computational practice. The suggestions below are far too expansive to be readily implemented, but are offered in the spirit of constructive criticism. I hope the authors can find the energy to move forward on some of these fronts and make an already excellent paper exceptional.

5) What do you see as the weakest aspect of this manuscript?:

References

Oberkampf, William L., and Christopher J. Roy. Verification and validation in scientific computing. Cambridge University Press, 2010.

Gresho, Philip M., and Robert L. Sani. "On pressure boundary conditions for the incompressible Navier‐Stokes equations." International Journal for Numerical Methods in Fluids 7.10 (1987): 1111-1145.

Sani, Robert L., and Philip M. Gresho. "Résumé and remarks on the open boundary condition minisymposium." International Journal for Numerical Methods in Fluids 18.10 (1994): 983-1008.

Metropolis, Nicholas, and Eldred C. Nelson. "Early Computing at Los Alamos." Annals of the History of Computing 4.4 (1982): 348-357.

Mattsson, Ann E., and William J. Rider. "Artificial viscosity: back to the basics." International Journal for Numerical Methods in Fluids 77.7 (2015): 400-417.

Quirk, James J. "A contribution to the great Riemann solver debate." Upwind and High-Resolution Schemes. Springer Berlin Heidelberg, 1997. 550-569.

Ioannidis, John PA. "Why most published research findings are false." PLoS Med 2.8 (2005): e124.

  1. Does the manuscript contain title, abstract, and/or keywords?: Yes
  2. Are the title, abstract, and keywords appropriate? Please elaborate in the Detailed Comments section.: Yes
  3. Does the manuscript contain sufficient and appropriate references (maximum 12, unless the article is a survey or tutorial in scope)? Please elaborate in the Detailed Comments section.: Important references are missing; more references are needed
  4. Does the introduction clearly state a valid thesis? Please explain your answer in the Detailed Comments section.: Yes
  5. How would you rate the organization of the manuscript? Please elaborate in the Detailed Comments section.: Satisfactory
  6. Is the manuscript focused? Please elaborate in the Detailed Comments section.: Satisfactory
  7. Is the length of the manuscript appropriate for the topic? Please elaborate in the Detailed Comments section.: Could be improved
  8. Please rate and comment on the readability of this manuscript in the Detailed Comments section.: Easy to read
  9. Please rate and comment on the timeliness and long term interest of this manuscript to CiSE readers in the Detailed Comments section. Select all that apply.: Topic and content are likely to be of growing interest to CiSE readers over the next 12 months

Please rate the manuscript. Explain your choice in the Detailed Comments section.: Excellent

labarba commented 7 years ago

Addressing reviewer comment # 1:

Perhaps the biggest issue to more completely address in the paper is the fundamental nature of the physical problem being solved. How well-posed is the problem, what sort of natural variability may be expected, and how reproducible are results intrinsically? One might look at a complementary study of vortex shedding off of a cylinder where the behavior is well-known. Are the results reproducible at very low Reynolds number? And do they cease being so as the Reynolds number increases, and if so at what value do the problems seen in the paper begin?

I read over Williamson (1996) again, looking for clues to frame the reply to this comment. It's true that bluff-body wakes are particularly complex flows: they involve three shear layers (the boundary layer, the separating free shear layer, and the wake). Physically, we can talk about instabilities in all of these. The free shear layer, for example, is subject to the Kelvin-Helmholtz instability (principally a 2D mechanism). In the case of the wake, von Karman showed in 1912 that vortex streets are unstable (albeit assuming an infinite row of vortices and no body).

There are three vortex-shedding regimes: a "stable" periodic one at Re=40–150, a transition at Re=150–300, and an "irregular" regime at Re=300–10,000. We are squarely in the "irregular" regime.

Finally, Williamson cites work by Karniadakis and Triantafyllou (1992) suggesting that as Re increases "the wake velocity fluctuations indicate a cascade of period-doubling bifurcations, which create a chaotic state in the flow at around Re=500."

Yet, I don't think any of this is pertinent to the message of our paper. Yes, our flow situation is complex. But you don't need a highly unsteady complex flow to be faced with challenges to reproducibility.

Take a simple saddle point in an ideal 2D flow. The dynamical system is linear and you can write down an analytical solution easily. But if you try to solve it numerically, it will be very sensitive to small errors. One initial condition right next to the stable manifold will come very close to the critical point, then shoot off to infinity. Another initial condition very, very close by, but on the other side of the manifold, will head off to minus infinity.
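To make this concrete, here is a little illustration (ours, not from the paper) of that sensitivity, using forward Euler on a linear saddle:

```python
# Linear saddle: dx/dt = x, dy/dt = -y; the stable manifold is x = 0.
# Two initial conditions straddling the manifold, 2e-12 apart, end up
# heading toward opposite infinities in x.
def integrate(x0, y0, dt=1e-3, t_end=30.0):
    x, y = x0, y0
    for _ in range(int(t_end / dt)):
        x, y = x + dt * x, y - dt * y  # forward Euler step
    return x, y

for x0 in (+1e-12, -1e-12):
    x, y = integrate(x0, 1.0)
    print(f"x0 = {x0:+.0e}  ->  x(30) = {x:+.3e}")
# x0 = +1e-12  ->  x(30) ~ +1e+01 (escaping toward +infinity)
# x0 = -1e-12  ->  x(30) ~ -1e+01 (escaping toward -infinity)
```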

Any time you have exponential growth in the model, there will be huge challenges to numerical reproducibility. This is not particular to complex vortical flows.

We can have long discussions about these issues, but it seems (to me) beyond the scope of our paper.

We already say that "highly unsteady fluid dynamics is a particularly tough application for reproducibility [...] In any application that has sufficient complexity, we should repeat simulations checking how robust they are to small variations." (p. 9).

"how reproducible are results intrinsically?" Well, there is no randomness in the model: the 2D Navier-Stokes equation should give deterministic results with sufficiently fine discretization, no?

"at what value do the problems seen in the paper begin?" We computed with two different values of Re: 1000, and 2000. The difficulties appear at the higher value of Re. We're not going to repeat this painful process with values of Re in between! ... we already made hundreds of simulations, and this is not the point of the paper, right? We're trying to replicate a previously published finding at Re=1000 and 2000.

In summary, I think we can only stress the points we already make in the paper.

Ref. Williamson, Charles HK. "Vortex dynamics in the cylinder wake." Annual Review of Fluid Mechanics 28.1 (1996): 477–539. DOI: 10.1146/annurev.fl.28.010196.002401.


Regarding well-posedness: to be well posed, the initial-boundary value problem must have a solution, the solution must be unique, and it must depend continuously on the data. With appropriate boundary conditions on pressure at infinity, the Navier-Stokes equations are well posed. (But, citing Nordstrom & Svard (2005), SIAM J. Num. Anal. 43(3):1231: "…the exact form of the boundary conditions that lead to a well-posed problem is still an open question…")

It sounds like the referee may be hinting at a sudden change in behavior at a given Reynolds number. Continuous dependence on the data also means that "small" changes in the parameters cannot result in "large" changes in the solution.

But I still think it is not in the scope of this paper to engage in a discussion about this. We computed the solutions at two values of Re: 1000 and 2000. We can't speculate about a bifurcation at a value of Re, nor embark on hundreds more simulations to look at intermediate Reynolds numbers. The goal was to replicate previous results, and we did.


[UPDATES]

mesnardo commented 7 years ago

I agree with you on the discussion about the stability of the wake; it would be way too long to explain and would side-track from the message of the paper. Still, it would be interesting to cite the work of Williamson (1996) or Karniadakis and Triantafyllou (1992) when we talk about the flow being highly unsteady.

labarba commented 7 years ago

Another thing about the "how reproducible are results intrinsically?" comment: WE DO, in fact, replicate the results (with OpenFOAM and IBAMR) eventually, so I think we've shown that there is no such intrinsic lack of reproducibility. It's just hard!

labarba commented 7 years ago

Addressing reviewer comment # 2:

I do not see any evidence of mesh convergence being conducted as part of the study. This seems to be a rather major and troubling shortcoming. We are given no evidence of whether any solution is remotely mesh independent or the degree of numerical error present in the solution. This is a major component of the proper practice in numerical modeling and simulation in this modern age all too often neglected in published works.

In this work, we are attempting to replicate the findings of a previous CFD paper (Krishnan et al., 2014), so there are three points to make in the response: (1) evidence of grid convergence was reported in the previous work, Krishnan et al.; (2) since this is a replication study, we use the same mesh as in the previous work, i.e., the mesh resolution is dictated by that criterion rather than by a new mesh-convergence analysis; (3) despite the previous point, a posteriori mesh convergence can help explain differences in the results.

The answer to (1) is that Krishnan et al. report mesh independence: differences in the average lift coefficient are on the order of 2% at 35 degrees angle-of-attack, and below 0.1% at 30 degrees.

PetIBM implements exactly the same method as cuIBM, so the same mesh is used. The only algorithmic difference between the two codes is the linear solver, so we checked that reducing the solver tolerance did not affect the solution (it did not).

With IBAMR, by contrast, we need mesh convergence to be able to say that comparing with the results of other codes is meaningful. We refined the mesh with IBAMR, and the results are self-consistent, at least for our chosen diagnostic, i.e., the average lift coefficient in a specified time range. Some differences, however, are apparent in the time signature of lift. With the finer mesh, we also decreased the solver tolerance, to make sure that discretization error dominates and algebraic errors are not polluting the results. The results were also consistent when using a larger physical domain. We are confident saying that the simulations with IBAMR are mesh-converged.

Another point is that we used CFL 0.3, which is the same value used in the IBAMR publication (Bhalla et al., 2013) with Re~5000. It seemed natural to use this as guidance, and in the end we did replicate the cuIBM results to our satisfaction. But when reducing the CFL to 0.1, the results with IBAMR differ: the observed drop in lift does not occur. This brings the results even closer to those of Krishnan et al. We will include this in the revised paper.

@mesnardo : What can we say about the OpenFOAM runs?


[UPDATES]

mesnardo commented 7 years ago

With PetIBM-0.1.1, the manuscript reports the numerical solution obtained with a relative tolerance of 1.0E-05 for each iterative solver. We tried a more stringent relative tolerance, decreasing the value from 1.0E-05 to 1.0E-08. We averaged the lift and drag coefficients between 32 and 64 time-units of flow-simulation (as was done in Krishnan et al.) but did not notice significant changes:

|  | Cd | Cl |
| --- | --- | --- |
| rtol = 1.0E-05 | 1.2351 | 1.9936 |
| rtol = 1.0E-08 | 1.2354 (+0.02%) | 1.9914 (-0.11%) |

The figure below displays the instantaneous force coefficients. [figure: forcecoefficientsre2000aoa35compareloosertol]
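As an aside, the diagnostic used throughout this thread is simple; a minimal sketch (ours, illustrative) of the time average:

```python
# Time-averaged force coefficient between 32 and 64 time units, given
# arrays of sample times t and instantaneous coefficients c.
import numpy as np

def time_averaged_coefficient(t, c, t_start=32.0, t_end=64.0):
    mask = (t >= t_start) & (t <= t_end)
    return c[mask].mean()

# e.g., for the lift coefficient: cl_mean = time_averaged_coefficient(t, cl)
```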

mesnardo commented 7 years ago

With IBAMR, we refined the mesh (halving the smallest grid-spacing, from 0.004 to 0.002) while keeping the other input parameters identical (including the default relative tolerance of 1.0E-06 for the iterative solvers).

The time-averaged lift and drag coefficients are reported in the table below:

|  | Cd | Cl |
| --- | --- | --- |
| grid-spacing: 0.004 | 1.2558 | 2.0580 |
| grid-spacing: 0.002 | 1.1850 (-5.6%) | 1.9318 (-6.1%) |

The lift and drag signatures are not exactly the same but exhibit the same behavior: a decrease in the mean coefficients after 30 time-units of flow-simulation. [figure: forcecoefficientsre2000aoa35comparemesh]

We also tried a more stringent exit criterion for the iterative solvers. We decreased the relative tolerance from 1.0E-06 to 1.0E-10 for each solver and obtained the following time-averaged coefficients:

|  | Cd | Cl |
| --- | --- | --- |
| rtol = 1.0E-06 | 1.2558 | 2.0580 |
| rtol = 1.0E-10 | 1.3015 (+3.6%) | 2.1575 (+4.8%) |

Again, the instantaneous forces acting on the bluff-body are not identical but share common features. [figure: forcecoefficientsre2000aoa35comparedefaultrtol]

We also increased the dimensions of the domain. The manuscript reports simulations using a [-15,15]x[-15,15] domain; the following results were obtained on a larger domain, [-30,45]x[-30,30]. Again, we averaged the force coefficients between 32 and 64 time-units:

|  | Cd | Cl |
| --- | --- | --- |
| [-15,15]x[-15,15] | 1.2558 | 2.0580 |
| [-30,45]x[-30,30] | 1.2009 (-4.4%) | 1.9564 (-4.9%) |

Although the instantaneous forces are not identical, we observe a similar drop in the mean values. [figure: forcecoefficientsre2000aoa35comparesmallerdomain]

mesnardo commented 7 years ago

Although we were able to replicate, with IBAMR, the main finding of Krishnan et al. (2014), we would like to provide an additional result concerning the flow at Reynolds number 2000 when the bluff-body adopts a 35-degree angle-of-attack.

The manuscript by Bhalla et al. (2013) details the mathematical formulation implemented in the IBAMR library. With IBAMR, we advance in time with a dynamic time-increment based on a prescribed convective CFL constraint. In their manuscript, Bhalla and co-workers mention that, unless otherwise noted, they used a CFL constraint of 0.3 in the examples reported. One of their examples concerns the two-dimensional flow around a fixed, rigid circular cylinder at Reynolds number 200. As no further information about the CFL constraint is provided, we assumed that they used a value of 0.3. Another example deals with the locomotion of a two-dimensional eel at Reynolds number 5609. Again, we could not find a stated value for the CFL constraint, and thus assumed that a value of 0.3 was maintained to integrate the solution in time.
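For reference, the convective CFL constraint ties the time increment to the grid spacing and the peak velocity; a minimal sketch (our notation, not IBAMR's API):

```python
# Dynamic time increment from a convective CFL constraint: the fastest
# fluid parcel crosses at most `cfl` cells per time step.
def adaptive_dt(cfl, h_min, u_max):
    return cfl * h_min / u_max

# Illustrative values (not from our input files):
dt = adaptive_dt(cfl=0.3, h_min=0.004, u_max=2.0)
```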

The Reynolds number of the eel example is of the same order of magnitude as the one in our flying-snake application; thus, we decided to use the same value for the CFL constraint. We obtained good agreement in the time-averaged force coefficients between the simulations using IBAMR and the ones reported in Krishnan et al. (2014), especially at angles-of-attack 25 and 30 degrees. Even though the coefficient at angle-of-attack 35 degrees is slightly smaller (-4.1%) than the one reported in Krishnan et al. (2014), the lift curve exhibits the spike. Thus, we consider that we replicated the main finding of Krishnan et al. (2014) with IBAMR.

However, digging into the input files of the eel application of Bhalla and co-workers (available in the GitHub repository of IBAMR), we saw that they used a CFL constraint of 0.1 (not 0.3). We therefore also tried this CFL value, to check the temporal convergence of our results. When we reduce the CFL constraint, we no longer observe a drop in the mean lift coefficient (as we did with the higher CFL), and we get closer to the signature obtained with cuIBM by Krishnan and co-workers.

[figure: ibamr_forcecoefficientsre2000aoa35]

[figure: ibamr_forcecoefficientsvsaoa]

When averaging the force coefficients between 32 and 64 time-units of flow-simulation, we get:

|  | Cd | Cl |
| --- | --- | --- |
| Krishnan et al. (2014) | 1.3162 | 2.1471 |
| IBAMR (CFL=0.3) | 1.2558 (-4.6%) | 2.0580 (-4.1%) |
| IBAMR (CFL=0.1) | 1.3282 (+0.9%) | 2.2285 (+3.8%) |

This new result does not modify the overall conclusions about IBAMR: we have been able to replicate the lift-enhancement reported in Krishnan et al. (2014). However, it is important to mention that we obtained even closer time-averaged coefficients when reducing the CFL constraint; we will include that in the revised manuscript.

References:


[UPDATES]

mesnardo commented 7 years ago

Addressing reviewer comment # 14

Computations actually began in the 1940’s in Los Alamos (during WWII). The first punched cards were used for the data flow of the calculation long before the code itself was on cards (details can be found on the LANL web site). Tape actually preceded cards for the written program, and the earlier machines were programmed by rewiring the computer itself with cable plugs (a couple of Los Alamos reports from the late-1940’s contain wiring diagrams).

In the current version of our manuscript, we mention that the origins of CFD at the Los Alamos Laboratory took place in the 1950s. As pointed out by the reviewer, the origins of CFD date back to the 1940s. The pleasant-to-read and informative paper by Metropolis and Nelson (1982) relates the transition from hand computing to punched-card computation, and then to electronic computing. The Los Alamos laboratory ordered its first punched-card machines from IBM. During wartime, the research conducted was so sensitive that IBM was not allowed to send a crew to install the machines at the laboratory: "Through our army connections, we asked IBM for the name of its best maintenance man drafted into the U.S. Army; that man, John Johnston, was requisitioned." At the beginning, the boxes of punched cards were difficult to "share," not only because of their bulk, but also because everything was kept secret. Note that the punched-card machines were used for implosion simulations that required integrating a two-dimensional partial differential equation.

In our manuscript, we have to correct the date in the sentence about the beginning of CFD at the Los Alamos Laboratory. We should also cite the paper by Metropolis and Nelson.

References:


[UPDATES]

mesnardo commented 7 years ago

About reviewer comment # 11:

The outflow boundary conditions issues were explored at length in the 1990’s by Phil Gresho and other researchers and documented in the Int. J Num. Meth. Fluids. It would seem to be a rather obvious source for gleaning greater insight on what is happening in this study.

Gresho and Sani (1987) focus on the pressure boundary conditions when using a pressure Poisson equation to satisfy the divergence constraint on the velocity field. They limit their study to cases with prescribed Dirichlet boundary conditions for the velocity (which is not valid at an open boundary, where the fluid flows through the boundary). A pressure Poisson equation is derived by applying the divergence operator to the momentum equation and assuming that the velocity satisfies the divergence-free constraint. The boundary conditions for the pressure Poisson equation are derived by projecting the momentum equation, at the boundary, in either the normal or the tangential direction. The momentum equation is a vector equation while a scalar boundary condition for the pressure is required; therefore, one can choose between the normal and tangential projections.
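For concreteness, here is our own rendering of the derivation summarized above (a standard manipulation, not a quote from the paper):

```latex
% Momentum and continuity for incompressible flow (nondimensional):
\partial_t \mathbf{u} + (\mathbf{u}\cdot\nabla)\mathbf{u}
  = -\nabla p + \frac{1}{Re}\nabla^2\mathbf{u},
\qquad \nabla\cdot\mathbf{u} = 0
% Applying the divergence to the momentum equation and using the
% divergence-free constraint yields the pressure Poisson equation:
\nabla^2 p = -\nabla\cdot\left[(\mathbf{u}\cdot\nabla)\mathbf{u}\right]
% The normal projection of the momentum equation at the boundary gives
% the Neumann condition for the pressure:
\frac{\partial p}{\partial n} = \mathbf{n}\cdot\left[
  \frac{1}{Re}\nabla^2\mathbf{u} - \partial_t\mathbf{u}
  - (\mathbf{u}\cdot\nabla)\mathbf{u}\right]
```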

A normal projection of the momentum equation at the boundary gives rise to a Neumann problem for the pressure. The tangential projection at the boundary leads to a Dirichlet boundary condition. Gresho and Sani demonstrate that if the Neumann boundary condition is applied to the pressure Poisson equation, the solution also satisfies the Dirichlet condition. They also mention that, for problems with sufficiently large Reynolds numbers and Dirichlet boundary conditions on the velocity, the Neumann boundary condition for the pressure can be approximated by a zero-gradient condition. In the conclusion of the paper, the authors say: "we and others often have difficulty with flow-through domains and many have gotten into trouble when numerical time integration is attempted."

Sani and Gresho (1994) report the results from two minisymposia on open boundary conditions (OBCs) for incompressible flows. As the authors recall, the boundary conditions for the pressure Poisson equation are usually derived from the momentum equation using the continuity equation in the case where the velocity (or its acceleration) at the boundary is known, which is not the case at an open boundary. Thus, the mysterious pressure boundary conditions must be obtained in some other way. "While we are not able to state the 'best' OBCs, if indeed such exist, we can list some qualities that they would display: they would permit both the flow and anything it carries to exit the domain gracefully and passively and not have any effect on the behaviour of the solution in the domain near the open boundary (and especially far from it); they would be transparent; they would lead to the same solution inside the common domain no matter where truncation occurred."

The authors mention three different open boundary conditions (for all quantities except the normal component of the velocity). According to them, a homogeneous Dirichlet condition is a poor choice while a homogeneous Neumann condition is a good one; an advection equation at the boundary, with a user-specified convection speed, seems to be "gaining in popularity" over the Neumann condition.

Later in the manuscript, Sani and Gresho report the results from different groups using various open boundary conditions for laminar problems. Quoting from the conclusion: "We have made some attempts at shedding more light on the difficult and unresolved area of seeking good OBCs for incompressible flow simulations. It has been an exercise in frustration and we are not thrilled with the results obtained, even though they may still be useful to some researchers; thus we pass the baton."

We should definitely cite the paper by Sani and Gresho (1994) reporting on open boundary conditions for incompressible flows.

References:

mesnardo commented 7 years ago

About the mesh convergence with OpenFOAM:

In our manuscript, we report the numerical solution obtained with icoFoam on a two-dimensional mesh generated with snappyHexMesh (one of the mesh utilities of OpenFOAM). The base uniform mesh contains 120x120 cells in a 30c x 30c domain (the snake cross-section is centered in it; c denotes the chord-length of the body), and we refined it in the wake region ([-2,15]x[-4,4]) with a characteristic cell-width of 0.16c. A further refinement (characteristic cell-width of 0.004c) was added in the near-wake region ([-1,10]x[-2,2]). The resulting mesh contains about 3.4M cells.

To check the convergence of the mesh, we added a further refinement (characteristic cell-width of 0.002c) in the vicinity of the body ([-1,4]x[-1,1]). The finer mesh contains about 5.3M cells. We computed the solution over 80 non-dimensional time-units.

Krishnan et al. (2014) reported mesh convergence with the relative difference in the time-averaged force coefficients when refining the mesh; we do the same with OpenFOAM.

The relative differences in the time-averaged force coefficients are negligible compared to those reported in the manuscript:

|  | Cd | Cl |
| --- | --- | --- |
| coarse mesh (~3.4M cells) | 1.2306 | 2.0803 |
| finer mesh (~5.3M cells) | 1.2359 (+0.43%) | 2.0868 (+0.31%) |

Although the time-averaged values are close, the instantaneous force coefficients exhibit some discrepancies, especially after the drop in the mean values. [figure: forcecoefficientsre2000aoa35comparecoarsergrid]

mesnardo commented 7 years ago

In Story 1, we report OpenFOAM simulations (snappyHexMesh/icoFoam) in which the velocity and pressure iterative solvers have an exit criterion based on a tolerance set to 1.0E-06.

We also ran a simulation (Re=2000 and AoA=35deg) where the tolerance was tightened to 1.0E-08. The instantaneous force coefficients are indistinguishable from those reported in the manuscript. [figure: forcecoefficientsre2000aoa35compareloosetol]

The time-averaged lift and drag coefficients are reported in the table below.

|  | Cd | Cl |
| --- | --- | --- |
| rtol = 1.0E-06 | 1.2306 | 2.0803 |
| rtol = 1.0E-08 | 1.2315 (+0.07%) | 2.0814 (-0.05%) |

In the incompressible Navier-Stokes system, the continuity equation imposes a scalar constraint on the momentum equation. By applying the divergence operator to the momentum equation and using the continuity constraint, one can derive a pressure Poisson equation. icoFoam is the transient OpenFOAM solver for incompressible, laminar flows of Newtonian fluids. It uses the pressure-implicit with splitting of operators (PISO) algorithm (Issa, 1986), which consists of solving the momentum equation to find a predicted velocity field (one that, in general, does not satisfy the incompressibility constraint) and then applying a given number of pressure correctors (each correction requires the solution of a pressure Poisson system) to ensure mass conservation. In the icoFoam simulations reported in the manuscript, we performed two pressure corrections.
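Schematically, one PISO time step looks like the sketch below (function names are illustrative placeholders, not the OpenFOAM API):

```python
# Schematic of one PISO time step (Issa, 1986), as used by icoFoam
# with two pressure correctors.
def piso_step(u, p, dt, n_correctors=2):
    # 1. Momentum predictor: implicit solve with the current pressure;
    #    the predicted velocity is generally not divergence-free.
    u = solve_momentum(u, p, dt)
    for _ in range(n_correctors):
        # 2. Pressure corrector: solve a Poisson system so that the
        #    corrected velocity satisfies the continuity equation.
        p = solve_pressure_poisson(u, dt)
        # 3. Explicit velocity correction with the new pressure gradient.
        u = correct_velocity(u, p, dt)
    return u, p
```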

References:

mesnardo commented 7 years ago

Addressing reviewer comment # 6:

I wouldn’t use the term “discretization meshes” but rather a “mesh to use for discretization”.

mesnardo commented 7 years ago

Addressing reviewer comment # 16:

The discussion around the failings of the published literature is quite good. It might be buoyed by some examples where the publication of negative results has had a disproportionately positive impact on the community. A couple of good examples are the papers by Quirk and Ioannidis included below.

  • We added a reference to Quirk (A Contribution to the Great Riemann Solver Debate, 1997). See change history.
labarba commented 7 years ago

Reviewer comment # 3:

There is not any significant investigation of the uncertainty from readily identifiable sources as part of the work.

From the lecture slides of Tim Barth (NASA Ames), the definition of uncertainty is: How accurately does a mathematical model describe the true physics and what is the impact of model uncertainty (structural or parametric) on outputs from the model?

With that in mind, there is at least a part of UQ that is outside the scope of our paper, and that is any assertion regarding how close our simulations are to the physics. In Krishnan et al. 2014, we cared about whether we got lift enhancement at a given angle of attack, which had been seen in water-tunnel experiments. But in this paper, we are trying to replicate the previous computational findings, and we are not concerned right now with how close they are to “the true physics." The other aspect of UQ has to do with variability in the parameters of the model. For example, if we were using a turbulence model, we would want to look at how the results are affected by variations in the value of kappa or epsilon or whatever. This also does not apply to our paper.

Note that uncertainty is different from error. AIAA defines error as: "A recognizable deficiency in any phase or activity of modeling and simulation that is not due to lack of knowledge." Oberkampf et al. (2000) divide errors into acknowledged (round-off, discretization, simplified model) and unacknowledged (programming errors, code usage mistakes).

Uncertainty also has two categories:

• Epistemic uncertainty: a potential deficiency in any phase or activity of the modeling process that is due to the lack of knowledge.
• Aleatory uncertainty: the physical variation present in the system being analyzed or its environment.

Examples of aleatory uncertainty: variability in boundary conditions and geometrical description of the model, for example due to manufacturing processes. So it’s characteristic of the physical system being studied.

I can see that UQ can be an added value when performing validation.

One method to study uncertainty in CFD is the so-called grid-convergence index, GCI (Roache, 1997). This entails obtaining the solution on three consecutively refined grids and applying Richardson extrapolation to estimate the numerical error. This method is straightforward with a uniform grid, but how could we use it in the case of unstructured grids (icoFoam), adaptive Cartesian grids (IBAMR), or even stretched Cartesian grids (cuIBM, PetIBM)? For non-uniform grids, Roache suggests that one can get an "effective grid-refinement ratio" using the number of grid points: (N_1/N_2)^(1/d), where d is the dimension of the problem. But this assumes that one has a way to systematically refine the grid (e.g., maintaining grid quality). There is no way to do systematic grid refinement if the solver uses AMR. It may be possible in the case of stretched Cartesian grids. If we attempted this (systematically refining and obtaining solutions on three consecutively finer grids in the asymptotic region), we could get a GCI, and we could say that the lift coefficient is calculated with an error band of x%. (With cuIBM, we have the problem that a finer mesh does not fit in the memory of one GPU card. We would have to coarsen, in that case.)
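For reference, a minimal sketch (ours) of the GCI computation we would attempt, following Roache (1997):

```python
import math

def gci_fine(f1, f2, f3, r, safety_factor=1.25):
    """Grid-convergence index (%) on the finest grid, from a quantity of
    interest f computed on three grids (f1 = finest, f3 = coarsest) with
    constant refinement ratio r."""
    # Observed order of convergence, via Richardson extrapolation.
    p = math.log((f3 - f2) / (f2 - f1)) / math.log(r)
    # Relative error between the two finest grids.
    e21 = abs((f2 - f1) / f1)
    return 100.0 * safety_factor * e21 / (r**p - 1.0), p

# Made-up time-averaged lift coefficients, for illustration only:
gci, p = gci_fine(2.05, 2.10, 2.30, r=2.0)
print(f"observed order p = {p:.2f}, GCI = {gci:.2f}%")  # p = 2.00, GCI ~ 1.0%
```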

It appears that the community is divided on whether we should consider numerical error as uncertainty. We found that Oberkampf (a staff researcher at Sandia in the Validation and Uncertainty Estimation Department) says that it is flawed to deal with numerical solution error in the same manner as experimental error, because "Numerical errors such as spatial and time-step discretization […] are not related to probability …" (Oberkampf, 2000).


Another aspect of this referee comment that I want to throw in here: the great majority of CFD papers don't do any uncertainty quantification. In fact, most don't even estimate the errors!

We made the following "study": scraping text from 10 years' worth of papers in J. Comp. Phys. (2000–2010), how many mention "Richardson extrapolation" at least once? Our result: 54 out of 4,242 papers. Searching for "uncertainty quantification" gives 30 papers, i.e., less than 1% of papers. (There could still be bugs in our text-scraping script.)
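The script itself is not posted in this thread; the count amounts to something like this sketch (illustrative, with a made-up corpus path):

```python
# Count how many plain-text papers in a directory mention a phrase at
# least once (case-insensitive).
from pathlib import Path

def count_papers_mentioning(corpus_dir, phrase):
    papers = list(Path(corpus_dir).glob("*.txt"))
    hits = sum(1 for paper in papers
               if phrase.lower() in paper.read_text(errors="ignore").lower())
    return hits, len(papers)

hits, total = count_papers_mentioning("jcp_2000_2010/", "Richardson extrapolation")
print(f"{hits} of {total} papers mention the phrase")
```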

Not that this is an excuse for totally ignoring the question of UQ, but it's a very difficult problem. Certainly not something we can do as part of a minor revision. I'm not sure we could do it as part of a whole new paper, in fact!


Ref.

mesnardo commented 7 years ago

Addressing comment # 12:

The linear algebra episode is a bit troubling and would seem to indicate some rather fundamental issues with the solver discretization. There may be a discrete violation of the Fredholm integral that renders the CG solver problematic. This might be a side effect of the discrete system being corrupted rather than a shortcoming of the solver itself. For properly formed symmetric systems, CG is vastly more robust and cost effective than BiCGstab!

When we looked into this problem, we sought help on the PETSc mailing-list. Barry Smith reported that they have heard of this problem before: the GAMG algorithm may generate a preconditioner with both positive and negative eigenvalues. We do not yet fully understand this problem (we have an open issue in the PetIBM repository about it). We ended up following Barry Smith's advice to switch to BiCGStab.
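For reference, the switch amounts to changing the Krylov solver type; an assumed PETSc options snippet (our runs set this through PetIBM's configuration files):

```
-ksp_type bcgs   # BiCGStab instead of cg
-pc_type gamg    # keep the GAMG algebraic-multigrid preconditioner
```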

mesnardo commented 7 years ago

Addressing comment # 13:

The whole episode with the libraries seems to indicate that the problem you are solving is itself unstable. This is another place where the lack of convergence testing and fundamental UQ may have a serious impact on the story. What level of fundamental physically reasonable variability can be expected in this problem? I suspect the degree of variability is actually quite large and therefore the problem may not lend itself to strict reproducibility. Instead there is a probabilistic framework for reproducibility that may make far more sense in the long run. The thing to reproduce may be statistical properties of the solution rather than a well-posed and unique initial value problem.

We have noted in several places that the flow problem is subject to instabilities. We have also reported on confirming grid independence with the various codes. (Admittedly, this is not a systematic grid-convergence analysis.) Quantifying the physical variability in this problem would entail a campaign of experiments, with the appropriate statistical strength. Then, comparing with such experiments would mean running fully 3D simulations, which is what we are aiming for. The 2D simulations already take about 2 days each (on one GPU or on 32 CPU cores). A statistical approach to reproducibility, in this case, sounds computationally very expensive. Also, despite all the problems we encountered, we do in fact replicate the previous findings, in most cases.

labarba commented 7 years ago

Addressing comment # 11:

The outflow boundary conditions issues were explored at length in the 1990’s by Phil Gresho and other researchers and documented in the Int. J Num. Meth. Fluids. It would seem to be a rather obvious source for gleaning greater insight on what is happening in this study.

Response:

We looked into the work by Sani and Gresho (1994), and found that it does not give us more insight. In fact, they reported that they were unhappy with the results from the workshop on open boundary conditions. They call the problem of choosing the best outflow boundary conditions "difficult and unresolved" and "an exercise in frustration."

Summary of the papers in another comment above.


Changes made in commit fadc4f3 add a comment and reference to Sani and Gresho, 1994.

mesnardo commented 7 years ago

Addressing comments 3, 4, 8, and 10:

These referee comments all bring up questions regarding uncertainty quantification and grid convergence.

The response in the comment above mentions one possible way to quantify uncertainty: the grid-convergence index of Roache (1997). We attempted to apply this technique with cuIBM, using the numerical solution on three systematically refined grids. Unfortunately, we encountered a few challenges. First, the quantity of interest in our study is the time-averaged lift coefficient (between 32 and 64 time units). For the medium mesh (the one reported in the manuscript), this requires 160,000 time steps (two days of run time on a K20). On a finer mesh, we need to reduce the time increment and tighten the iterative tolerances. After a few tries with different parameters, we had to give up on obtaining this fine-grid solution in a reasonable time frame. In other situations, one can complete a grid-convergence study with shorter simulations, but for this unsteady problem, we need several periods of shedding to calculate an average. It turned out to be too expensive to complete during this revision. Instead, we provide evidence of grid independence on the same grids as Krishnan et al. (2014) did. We supplement the manuscript with additional Jupyter notebooks reporting on our efforts to assess independence with respect to grid spacing, time increment, and iterative tolerance. We investigated these issues with each code.

Given space constraints for the manuscript, we point readers to the supplementary materials for these issues. (See changes in the revised version.)