libMesh / libmesh

libMesh github repository
http://libmesh.github.io
GNU Lesser General Public License v2.1
659 stars 286 forks source link

Automatic Differentiation for geometric entities? #2121

Open lindsayad opened 5 years ago

lindsayad commented 5 years ago

So to run moving mesh problems with perfect Jacobians in MOOSE I've developed methods like Assembly::computeSinglePointMapAD that essentially reproduces FEMap::compute_single_point_map but makes changes like:

      for (auto i : index_range(elem_nodes))
      {
        libmesh_assert(elem_nodes[i]);
        libMesh::VectorValue<DualReal> elem_point = *elem_nodes[i];
        unsigned dimension = 0;
        for (const auto & disp_num : _displacements)
          elem_point(dimension++)
              .derivatives()
              .insert(disp_num * _sys.getMaxVarNDofsPerElem() + i) = 1.;

        _ad_dxyzdxi_map[p].add_scaled(elem_point, dphidxi_map[i][p]);

        if (_calculate_xyz)
          _ad_q_points[p].add_scaled(elem_point, phi_map[i][p]);
      }

Doing this allows me to propagate the dependence on non-linear displacement variables down into quantities like JxW, normals, curvatures, and shape function gradients (and consequently variable solution gradients). I'm pained, however, by the amount of code I've duplicated. By dream would be to be able to do these calculations within libMesh::FE objects.

Realizing that dream would likely mean an overhaul of the FE object. In MOOSE I've introduced an enum ComputeStage that can take on RESIDUAL and JACOBIAN values. AD (automatic differentiation) objects are then templated on this enum so that their data members essentially take on Real and DualNumber character for residual and jacobian objects respectively. I imagine that something similar would have to be done to FE: introduction of another template parameter. Such an overhaul could not break existing applications so it might actually mean leaving FE alone (at least its APIs) and creating a new class that could share code (probably through inheritance).

I dunno, do folks thing it's possible to avoid all this code duplication in a reasonable way?

dschwen commented 5 years ago

At first glance it looks like adding a third template parameter with a default value that keeps current behavior should not break anything.

lindsayad commented 5 years ago

So I guess we would have to take the default template parameter all the way down into FEAbstract because that's where a lot of the getters are (JxW etc.). FEMap would also have the default template parameter--obviously it needs one because that's where the mapping calculations are done.

Here's the problem with that...in C++11 this is not valid code:

template <typename T = double>
class A
{
};

int
main()
{
  A a;
}

You would have to use A<> a. So anywhere someone has FEAbstract they would get a compiler error. Note that the above code is valid in C++17.

dschwen commented 5 years ago

C++17 here we come!

lindsayad commented 5 years ago

I would have to look at GRINS, but MOOSE doesn’t actually have a single occurrence of FEAbstract (at least in the framework).

On May 2, 2019, at 5:25 PM, Daniel Schwen notifications@github.com wrote:

C++17 here we come!

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

pbauman commented 5 years ago

I would have to look at GRINS, but MOOSE doesn’t actually have a single occurrence of FEAbstract

GRINS doesn't either.

lindsayad commented 5 years ago

Nor does either repo appear to have any occurrences of FEMap...

lindsayad commented 5 years ago

@roystgnr I'm going to start in working on this because it would be extremely beneficial for me if we had this capability. Please throw up any red flags if you can think of them

roystgnr commented 5 years ago

The usual red flag with changing template signatures is that it breaks forward declarations, but I don't actually see any instances where that's likely to crop up.

I think we may actually be protected here because of the way Paul refactored things when working on vector-valued elements. Most user code that wants a generic handle to an FE object still uses FEBase, but FEBase is currently a typedef to FEGenericBase<Real>, and so we can change that to be a typedef to FEGenericBase<Real,Real> pretty seamlessly.

lindsayad commented 5 years ago

Had a problem sending this to libmesh-devel, so posting here since it's related:

Can I change Point to inherit from VectorValue? To generalize FE and FEMap, I'd like to replace a lot of the occurrences in the FEAbstract interface of Point with something like PointType<RealType>::type where I define PointType like:

template <typename RealType>
struct PointType
{
  typedef VectorValue<RealType> type;
};
template <>
struct PointType<Real>
{
  typedef Point type;
};

I'd like the user who's exercising the generalized capability to have some consistency. E.g. they're used to passing in a std::vector of Points when optionally reinit'ing on custom points. But if they want points to be a std::vector of point like things that may include derivative information, they generally can't use raw TypeVectors (because we've restricted how they can construct them); they have to use VectorValues. So if they need to use VectorValues when calling reinit with their custom point type, then I think it would be good if Point itself is a VectorValue.

roystgnr commented 5 years ago

Can I change Point to inherit from VectorValue?

This feels like a classic Chesterton's Fence situation from my point of view. I aped the TypeVector/VectorValue organization when creating TypeTensor/TensorValue, but I didn't (and still don't) really understand why the previous classes were structured the way they are. So although I can't think of any reason not to change the inheritance for Point, that just makes me assume that @benkirk or @jwpeterson must know something I don't.

benkirk commented 5 years ago

I’m pretty far from the bleeding edge, but since Point was certainly one of the first 3 classes created, its fair to say it is dated and things might be done differently if from scratch.

So if it makes sense, doesn’t increase memory, and maybe can be converted with accessor / interface /proxy methods for some backward compatibility, why not?


On: 03 May 2019 13:06, "roystgnr" notifications@github.com<mailto:notifications@github.com> wrote:

Can I change Point to inherit from VectorValue?

This feels like a classic Chesterton's Fencehttps://urldefense.proofpoint.com/v2/url?u=https-3A__en.wikipedia.org_wiki_G.-5FK.-5FChesterton-23Chesterton-27s-5Ffence&d=DwMCaQ&c=ApwzowJNAKKw3xye91w7BE1XMRKi2LN9kiMk5Csz9Zk&r=cL6XMjnReoBDeWspbxyIhOHmg_O4uY2LnJOBmaiGPkI&m=vjFaPFyymdR8rW1r0lfaCnSc7ouhqT4D1YE5Hci7vzo&s=2KGg2HAcq-t4tQgzgEqRXlAaLcoH_s0f-1K46zpAQMw&e= situation from my point of view. I aped the TypeVector/VectorValue organization when creating TypeTensor/TensorValue, but I didn't (and still don't) really understand why the previous classes were structured the way they are. So although I can't think of any reason not to change the inheritance for Point, that just makes me assume that @benkirkhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_benkirk&d=DwMCaQ&c=ApwzowJNAKKw3xye91w7BE1XMRKi2LN9kiMk5Csz9Zk&r=cL6XMjnReoBDeWspbxyIhOHmg_O4uY2LnJOBmaiGPkI&m=vjFaPFyymdR8rW1r0lfaCnSc7ouhqT4D1YE5Hci7vzo&s=xhAhR5a4yugPBbLPNVQEeqezfVKL1c57UjxbtzgA44s&e= or @jwpetersonhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_jwpeterson&d=DwMCaQ&c=ApwzowJNAKKw3xye91w7BE1XMRKi2LN9kiMk5Csz9Zk&r=cL6XMjnReoBDeWspbxyIhOHmg_O4uY2LnJOBmaiGPkI&m=vjFaPFyymdR8rW1r0lfaCnSc7ouhqT4D1YE5Hci7vzo&s=acS6n8ejdBobVfEGMGPWhyboVhHk1n8BtQGY_3KgUpo&e= must know something I don't.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_libMesh_libmesh_issues_2121-23issuecomment-2D489188023&d=DwMCaQ&c=ApwzowJNAKKw3xye91w7BE1XMRKi2LN9kiMk5Csz9Zk&r=cL6XMjnReoBDeWspbxyIhOHmg_O4uY2LnJOBmaiGPkI&m=vjFaPFyymdR8rW1r0lfaCnSc7ouhqT4D1YE5Hci7vzo&s=yZ9qUNXIXSvIyk2vdosb1OQ1qDevOgUYfeO8zf3p9tM&e=, or mute the threadhttps://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AASBY3CIRZBSYGVVKLSUEJLPTR5J5ANCNFSM4HKIE6BQ&d=DwMCaQ&c=ApwzowJNAKKw3xye91w7BE1XMRKi2LN9kiMk5Csz9Zk&r=cL6XMjnReoBDeWspbxyIhOHmg_O4uY2LnJOBmaiGPkI&m=vjFaPFyymdR8rW1r0lfaCnSc7ouhqT4D1YE5Hci7vzo&s=TWkhi-dK8aw971iIApx7xg74QprldOi0NCkqajeo0_g&e=.

lindsayad commented 5 years ago

@roystgnr Would you mind looking at lindsayad/libmesh@cab5ebe0a to see whether this seems like an acceptable direction?

lindsayad commented 4 years ago

We are speculating that the easiest way for us to get flow-through derivative calculations would be to actually typedef Real to be a DualNumber. @rwcarlsen mentions that there are a fair number of instances in libMesh where the code expects Real to be a POD type, but what do folks think about this general idea?

In order for this to work for realistic simulations, the derivative storage type would have to use the heap in order to have acceptable memory usage; we wouldn't want to automatically multiply the storage requirement of every Real by the size of a statically sized array. In order to use the heap, we think we'd need to use a memory pool in order to have tolerable computational performance.

roystgnr commented 4 years ago

Hmm... I'm not sure. Not having to template everything to hell and back sounds easier, but having to make sure every code using Real is now compatible with DualNumber sounds potentially harder. Maybe I'm worrying over nothing, though, if the need for derivatives ends up propagating through practically everything anyway.

the derivative storage type would have to use the heap in order to have acceptable memory usage; we wouldn't want to automatically multiply the storage requirement of every Real by the size of a statically sized array

So here's what scares me that I hadn't thought about before. Not POD assumptions in general: there's probably thousands of places where we expect Real to be POD from an efficiency standpoint, but where the results will still be correct and the efficiency can be fixed easily. But one category of POD assumption in particular: we assume that Real, Point, etc. are fixed-size types for the purposes of TIMPI calls, so that we can use StandardType and not have to pack and unpack them. In my wild dreams there will one day be some kind of SFINAE overloads of the TIMPI methods, that will auto-detect and use packed ranges under the hood whenever necessary, but that day will not be any time soon, and in the meantime there'll be a lot of parallel code that will have to get vastly messier if you want this to work in parallel.

dschwen commented 4 years ago

and the efficiency can be fixed easily

That is a bold assumption. It is not obvious to me that arithmetic with overloaded operators and runtime sized types can be optimized by the compilers as well as plain Real arithmetics. But I guess that's for the profiler to determine...

roystgnr commented 4 years ago

By "the efficiency" here I only mean "the efficiency losses due to straightforward programmer choices made while assuming Real==double", not "the efficiency losses inherent to loop operations, mixed sparsity patterns, and heap storage". By "fixed easily" I then mean only things like "usually pass by const reference instead of by value", not "improve compiler optimization of operator overloads on heterogeneously sparse vectors".

We'll eventually want to work on more complicated optimizations too, of course, but that'll be hard work (How much mileage can we get out of move parameters? Should we 'hash' sparsity patterns as bitmaps so we can optimize away indexing work in the case of operands with identical patterns?) for diminishing returns. I think it's fair to be much more worried about cases where it looks like we'll have to do hard work merely to get things to compile.

lindsayad commented 4 years ago

So here's what scares me that I hadn't thought about before. Not POD assumptions in general: there's probably thousands of places where we expect Real to be POD from an efficiency standpoint, but where the results will still be correct and the efficiency can be fixed easily. But one category of POD assumption in particular: we assume that Real, Point, etc. are fixed-size types for the purposes of TIMPI calls, so that we can use StandardType and not have to pack and unpack them. In my wild dreams there will one day be some kind of SFINAE overloads of the TIMPI methods, that will auto-detect and use packed ranges under the hood whenever necessary, but that day will not be any time soon, and in the meantime there'll be a lot of parallel code that will have to get vastly messier if you want this to work in parallel.

@roystgnr Can you outline some of the changes that would have to be made? I'm looking at parallel_node.C and I can see that we'd have to do something about idtypes_per_Real. My initial inclination there would be to create a templated struct with static const unsigned int data member like id_types_per_T. So that right there is not too tricky. I'm assuming there's much more and trickier business to tackle?

I've been thinking more and wondering at least as a first pass whether instead of templating the whole (geometric) world as in #2396, whether to simply configure a GeomReal type, where most users would typedef Real GeomReal but one could typedef GeomReal to be some AD type. If we did this, we have the parallel packing/unpacking code that we need to change. However, we would also have to do this in #2396 as well unless we decide that all parallel communication with respect to geometry is going to be done with a reference/undisplaced mesh (e.g. a mesh that would never need AD information) that has RealType = Real and any case where a user wants AD information on a displaced mesh (a mesh with RealType = SomeADType), they're just going to use the same partitioning, ghosting, etc. of the undisplaced mesh. This is troublesome because in MOOSE we do geometric searches based on a displaced mesh, and these searches may determine different elements that need to be ghosted than would be determined on an undisplaced mesh.

So I'm thinking that generalization of the parallel code may have to happen whether we proceed via typedef or via templating. I think that the typedef route may be the best first attempt because overall it would be much less disruptive to the code base. It would be slower than the templating route because the user will be doing AD calculations behind the scenes even when on an undisplaced mesh. How much slower may determine whether the templating route in #2396 still needs to happen. Thoughts?

lindsayad commented 4 years ago

Ok, I can see that we'd have to define StandardType<ADType> specializations...

roystgnr commented 4 years ago

If ADType was a fixed-size type with no "deep" members or other sophistication, the sort of thing for whichmemcpy(newfoo, oldfoo, sizeof(Foo)) works, then specializing StandardType<ADType> would actually be pretty easy.

But it's not, right? Or at least, I assume you don't want to tie yourselves down to "sparse derivative arrays have a maximum size fixed at compile time" as the only possible option in parallel?

So what we need, if you want to support dynamically resizeable GeomReal, isn't just Packing<ADType> specializations, it's also some way to seamlessly let every Communicator::foo(T) call instead invoke a Communicator::foo_packed_range(T) equivalent whenever T is a dynamic type.

The dispatch seems like it would be straightforward so long as the call arguments can be made to match up: have a static const bool StandardType<ADType>::is_fixed_type member, and use template specialization to make sure the standard fixed-size-type method is called when is_fixed_type is true and the packed-range method is called when is_fixed_type is false.

But in our current Packing APIs, the call arguments can't be made to match up! I'd invented those for use with moving Node and Elem from processor to processor, and doing that efficiently means directly serializing them from the source processor Mesh and unserializing them into the target processor Mesh, and so we have the "context" and "output iterator" arguments to make that sort of behavior easy. We don't have any equivalent of those arguments for any of the non-packed range APIs, because the destination data structure is generally an array we can just hand to MPI to fill, and when it's not then it's at least some standard container type we pass in.

On the other hand, for a GeomReal we wouldn't be building stuff in a specific Mesh or anything, we'd be just doing a plain heap allocation at worst. So... if we can get by with a void context and a generic output iterator, maybe what we need is for the !is_fixed_type methods to just construct the void context and the generic output iterator themselves, and then we can still specialize Packing<GeomReal> and use the same APIs?

lindsayad commented 4 years ago

This is very enlightening. Yea, I definitely hadn't been thinking about the dynamic size aspect, which yes we almost certainly would be doing.

On the other hand, for a GeomReal we wouldn't be building stuff in a specific Mesh or anything, we'd be just doing a plain heap allocation at worst.

Persistent GeomReals would all come from the the Node::_coords data member because in the typedef design I would make Point inherit from VectorValue<GeomReal>...so we are building stuff in a specific mesh...?

Yea, at this current time, I don't anticipate sending and receiving GeomReals outside of the context of sending and receiving Nodes. So it seems like specializing Packing<GeomReal> and calling Packing<GeomReal>::foo when calling Packing<Node>::foo should be enough for now?

roystgnr commented 4 years ago

Building stuff in a specific mesh,but never doing complicated stuff like repairing neighbor links, consolidating duplicate entries, etc. Much easier problem.

Fixing up Packing ought to get you 80% of the way there. Especially if you do it the smart way: rely internally on Packing, and also partially specialize Packing for is_fixed_type cases so you dont need two separate code paths for Real vs GeomReal.

lindsayad commented 4 years ago

Much easier problem.

😃

also partially specialize Packing for is_fixed_type cases so you dont need two separate code paths for Real vs GeomReal.

Definitely, that was what I was thinking.