Exawind / amr-wind

AMReX-based structured wind solver
https://exawind.github.io/amr-wind
Other
103 stars 78 forks source link

MAC velocity not getting filled correctly at the low boundary #1076

Closed mukul1992 closed 3 weeks ago

mukul1992 commented 1 month ago

Bug description

I am using u_mac (MAC velocity in x-direction) as an example. The index i=0 lies at the boundary for u_mac, but does not get filled by fillpatch_sibling_fields called from preadvect in icns_advection.H.

Steps to reproduce

Steps to reproduce the behavior:

  1. Compiler used
    • [x] GCC
    • [ ] LLVM
    • [ ] oneapi (Intel)
    • [ ] nvcc (NVIDIA)
    • [ ] rocm (AMD)
    • [ ] with MPI
    • [ ] other:
  2. Operating system
    • [ ] Linux
    • [x] OSX
    • [ ] Windows
    • [ ] other (do tell ;)):
  3. Hardware:
    • [x] CPU
    • [ ] GPU
  4. Machine details: Apple M3 laptop
  5. Input file: rankine.inp

I am commenting out my originally reported test and have added a new test in the next comment below.

AMR-Wind information

==============================================================================
                AMR-Wind (https://github.com/exawind/amr-wind)

  AMR-Wind version :: d6d42205
  AMR-Wind Git SHA :: d6d422058c48cf378117e1ab61c3c5d499ac5a2a
  AMReX version    :: 24.05-20-g5d02c6480a0d

  Exec. time       :: Fri May 24 12:09:29 2024
  Build time       :: May 24 2024 09:19:32
  C++ compiler     :: GNU 13.2.0

  MPI              :: ON    (Num. ranks = 1)
  GPU              :: OFF
  OpenMP           :: OFF

  No additional third-party libraries enabled

           This software is released under the BSD 3-clause license.
 See https://github.com/Exawind/amr-wind/blob/development/LICENSE for details.
------------------------------------------------------------------------------

Additional context

Apart from the fact that technically i=0 lies at the boundary and hence should be filled, the main issue arises for me while implementing the inflow-outflow BC which can have cells with Dirichlet treatment on both the low and high boundaries. The MAC velocity at the high boundary (e.g. u_mac at the xhi boundary) does get filled in the above function and this creates an unexpected asymmetry. For example, if there are 16 cells in the x-direction, i=0 and i=16 would lie on the low and high boundaries respectively. i=0 does not get filled, but i=16 does get filled, hence breaking the symmetry.

Also, I think this issue would only surface for a time-varying boundary condition.

My guess is that the nodal nature of MAC-velocities (in the respective directions) is not accounted for in the backend fillpatch routines and hence the index i=0 is not filled as it lies inside the domain for a cell-centered variable.

mukul1992 commented 1 month ago

I change the rankine.inp file to prescribe mass_inflow at xhi too, just for checking how it acts:

xlo.type = "mass_inflow"
xlo.density = 1.0
xlo.velocity.inflow_type = Rankine
xlo.temperature = 300.0
#xhi.type = "pressure_outflow"
xhi.type = "mass_inflow"
xhi.density = 1.0
xhi.velocity.inflow_type = Rankine
xhi.temperature = 300.0

I insert explicit checks within the Rankine.H UDF operator():

            amrex::IntVect ivlo(AMREX_D_DECL(0, 12, 1));
            amrex::IntVect ivhi(AMREX_D_DECL(40, 12, 1));
            if (iv == ivlo) amrex::Print() << "filling " << vel[orig_comp + comp] << " at i=0" << std::endl;
            if (iv == ivhi) amrex::Print() << "filling " << vel[orig_comp + comp] << " at i=40" << std::endl;

The output shows filling the i=40 values but not i=0. The domain has 40 cells in the x-direction and so both of them should be treated similarly as they lie on the boundary:

  MAC_projection                 9         0.01766980429       1.787159233e-14
Fill mac velocities using velocity BCs
filling 10.14358888 at i=40
filling 0.5357916479 at i=40
filling 0 at i=40
  temperature_solve              2       1.221647966e-05        1.98951966e-12
marchdf commented 1 month ago

I can reproduce this. Thank you for posting the detailed information. Did you get far enough to determine a potential fix or reason for this? I will be looking at this later this week.

marchdf commented 1 month ago

One thing that bugs me about this UDF is that it is assuming that it is filling a cell-centered field (e.g. the way it computes x: https://github.com/Exawind/amr-wind/blob/main/amr-wind/physics/udfs/Rankine.H#L36). So there will be a difference when you look at the value it puts there. It looks like it is filling -1 and 40:

filling 10.73181747 at i=-1
filling 1.016180833 at i=-1
filling 0 at i=-1
filling 10.14358888 at i=40
filling 0.5357916479 at i=40
filling 0 at i=40

It's not yet clear to me why it does that. Ultimately this is being called from amrex::FillPatchSingleLevel with nghost = 1 1 1, a face centered field (the mac vel), and a physbc we give it. AMReX handles the functions calls between that when we drop into physbc (our UDF). Maybe we are using this call wrong? Maybe this all makes sense to someone else but I am still digging around.

Here's some useful context for all this (when I first had a go at this madness):

Given the number of PRs/commits/issues etc that I went through when dealing with this stuff, I clearly lost my mind (at the time for sure, and probably still now).

mukul1992 commented 1 month ago

One thing that bugs me about this UDF is that it is assuming that it is filling a cell-centered field

Yes, that was going to be my next concern after making sure that the correct indices are being filled. I was wondering if there is a way to pass in the IndexType of the velocity being filled in the custom velocity UDF so that the coordinates can be calculated correctly depending upon whether it is a cell-centered or face-centered velocity.

It looks like it is filling -1 and 40

Right, it fills only -1 on the lower boundary and fills both 40 and 41 on the higher boundary. But my understanding is that it should also fill 0 because that lies at the wall like 40.

Thank you for providing the additional context in terms of past work and how we seem to doing the correct thing, I have not dug around further today, I will try to make sense of it tomorrow, though it may take more time for me as I'm less familiar with this part of AW/amrex.

marchdf commented 1 month ago

Passing in IndexType would require changes in amrex because that's where the function signature/call for amrex::PhysBCFunct<amrex::GpuBndryFuncFab<Functor>> physbc is happening...

mukul1992 commented 1 month ago

Okay, I think it is better to consider that feature later since it is not immediately relevant, and to focus on why i=0 is not being filled.

mbkuhn commented 1 month ago

On that topic, I think that the unexpected behavior is not that i=0 is not being filled; rather, it's that i=40 is being filled. The boundary functions are meant to fill ghost cells, and i=0 is not a ghost cell; it's a valid cell. For a quantity on an x-face, i=40 is a valid cell, yet it's being filled. This is checked fairly easily by doing setBndry() for umac -- that operation does not touch either i=0 or i=40 values, but it does modify i=-1 and i=41.

I can see that you might want to change these valid boundary cells for the sake of your new BC, which is fine. But my understanding of the AMR-Wind implementation is to fill the ghost cells of the cell-centered velocity with correct values, and then the MAC velocities at the boundaries are calculated from those cell-centered velocities. fillpatch_sibling_fields is meant to populate the ghost cells of the MAC velocities, but not the ones at the valid boundaries.

Marc and I are getting close to understanding what needs to be done to get the correct behavior.

mukul1992 commented 1 month ago

Okay, that makes complete sense to me, thanks for explaining and for working on this. :)

I think the MAC velocities for valid cells should be computed fine with all the other special handling that I have put in, unless I'm missing something. Once we have this issue sorted, I will have to figure out how to apply my custom Neumann fills using the MassInflowOutflowBC operator to the MAC velocity ghost cells.

marchdf commented 1 month ago

100% with Michael here. The curiosity is why i=40 is being filled.

  MAC_projection                 9         0.01766980429       1.791708441e-14
ghost cells here: (1,1,1) for idim: 0
--------------------------- filfcface call starts
Going to look at this tmp box: ((-1,0,0) (0,59,3) (1,0,0)) with b: ((-43,0,0) (0,59,3) (1,0,0)) and bx: ((-1,-1,-1) (41,60,4) (1,0,0))
filling 10.73181747 at i=-1
Going to look at this tmp box: ((0,-1,0) (40,-1,3) (1,0,0)) with b: ((0,-62,0) (40,-1,3) (1,0,0)) and bx: ((-1,-1,-1) (41,60,4) (1,0,0))
Going to look at this tmp box: ((0,0,-1) (40,59,-1) (1,0,0)) with b: ((0,0,-6) (40,59,-1) (1,0,0)) and bx: ((-1,-1,-1) (41,60,4) (1,0,0))
Going to look at this tmp box: ((40,0,0) (41,59,3) (1,0,0)) with b: ((40,0,0) (83,59,3) (1,0,0)) and bx: ((-1,-1,-1) (41,60,4) (1,0,0))
filling 10.14358888 at i=40
filling 10.13957005 at i=41
Going to look at this tmp box: ((0,60,0) (40,60,3) (1,0,0)) with b: ((0,60,0) (40,121,3) (1,0,0)) and bx: ((-1,-1,-1) (41,60,4) (1,0,0))
Going to look at this tmp box: ((0,0,4) (40,59,4) (1,0,0)) with b: ((0,0,4) (40,59,9) (1,0,0)) and bx: ((-1,-1,-1) (41,60,4) (1,0,0))
--------------------------- filfcface call ends

the Box tmp = b & bx; in AMReX_PhysBCFunct.H is the curious operation. For a face centered field with 1 ghost cell, we would expect i=-1 and i=41 to be filled. But the & is also including i=40. Really we would want extra checks in the UDF function: if bx.ixType() is a face and i > ihi (+1?) then fill it. You can see the fancy footwork (million if conditions) that are being done in FilfcFace in AMReX_FilFC_3D_C.H. We would want to mimic those I think. But we don't have access to bx or the ixType so we can't even begin to use those... not sure the solution to that. In any event, it doesn't much matter because the mac vel ghost cells are not needed (other than for BDS and multilevel Godunov).

All that being said. If you want i=0 or i=40 for the mac vel filled, you are going to need a different strategy... Right now I am not sure what it looks like. Other than an explicit new function call that loops on those faces you want to fill. Maybe right after that fillpatch call?

asalmgren commented 1 month ago

Just a comment re what mkuhn said above -- even when we fill ghost cells for cell-centered velocity there is an "understanding" in amrex that the value in that ghost cell actually represents the value on the face (even though we store it as a ghost "cell"). I wasn't sure if what you wrote above agreed or disagreed with that so just thought I'd put that here to be sure.

mukul1992 commented 1 month ago

If you want i=0 or i=40 for the mac vel filled, you are going to need a different strategy... Right now I am not sure what it looks like. Other than an explicit new function call that loops on those faces you want to fill.

At this point, I just want symmetric behavior between the low and high boundaries. So if i=40 not being filled is the correct behavior numerically then I'd be good to go.

marchdf commented 1 month ago

@asalmgren yup that's understood. Though I bet amr-wind does this inconsistently (this Rankine UDF is one of those that doesn't do that bit right I think). Going through all the UDFs that have been added in the past and making sure that they fit this assumption needs to be done. But I think it is somewhat tangential to this issue right now.

I pinged Weiqun to see if he has ideas for the i=40 index. These indices and the function calls come to us from amrex so we are a bit boxed in to what we can do/have access to.

asalmgren commented 1 month ago

I've gone back through this thread more carefully -- I think the missing piece here (and I'm 90% sure but not 100% sure that everything I say below is correct) is that the physbc is defined by the user and has to be appropriate for the quantity being filled. So it's not amrex's job to move data to the right place, it's the user's job when providing a physbc to make sure that physbc matches the data type. In other words, you shouldn't be using the same physbc to do a fillpatch on the cell-centered velocity as on the mac velocity

marchdf commented 1 month ago

Yes I think that's about right and the same conclusion I reached last night. It would be good to achieve this without a ton of duplicate code in amr-wind. Going to have to think on this one...

My immediate concern though is with this:

Going to look at this tmp box: ((-1,0,0) (0,59,3) (1,0,0)) with b: ((-43,0,0) (0,59,3) (1,0,0)) and bx: ((-1,-1,-1) (41,60,4) (1,0,0))
filling 10.73181747 at i=-1

This box ((-1,0,0) (0,59,3) (1,0,0)) contains 0 so why does it say that it is only filling -1? I think amr-wind must be intercepting something?

mbkuhn commented 1 month ago

You called it, Marc. I confirmed the obvious, that indices with i = 0 are getting passed to f_user(). However, f_user is not what goes to the Rankine.H operation directly. First, it has to go through the Functor that initialized the GpuBndryFuncFab, which, for this case, is "DirichletOp" in FieldBCOps.H in AMR-Wind. This operator is what checks whether the field at the index should be filled or not, using a bool "is_bndry". This check assumes a cell-centered approach, which is why i = 0 is ignored but i = 40 isn't. After this check, either inflow_op (which would be the UDF in this case) or wall_op is called.

marchdf commented 1 month ago

yes! I just found the same. The offending line is https://github.com/Exawind/amr-wind/blob/main/amr-wind/core/FieldBCOps.H#L172. This is fine for cell-centered, not fine for face-centered. Phew. Now to think of the right way to fix.

marchdf commented 1 month ago

Right now my thinking is this: In FieldFillPatchOps.H:

Functor bc_functor() { return m_op(); }
Functor face_bc_functor() { return m_op.face(); }.  ----> add this and then use `face_bc_functor` in fillpatch_sibling_fields

then in FieldBCOps:

inline FaceFunctorType face() const
    {
        return FunctorType{
            m_field, m_inflow_op.device_instance_face(),
            m_wall_op.device_instance()};
    }

Though we might want using FaceFunctorType = DirichletOpFace<InflowOpType, WallOpType>; to get the right () operator that does the right is_bndry.

then in all the UDFs:

struct DeviceOpFace 2 refs|1 var
{
... something that looks like DeviceOp. Maybe it inherits it so that it is similar? Would need to make sure the members are initialized properly too and are the same as DeviceOp.
}

DeviceType device_instance_face() const { return m_op_face; }

I need to sit on this and think this through before coding anything. I am sharing this in case others have thoughts/see ways to simplify.

marchdf commented 3 weeks ago

Addressed in #1093