Zero/Verify Output Fields Set in QFunctionApply

jeremylt commented 5 years ago

In development of the NS example, we discovered that a QFunction author may not set all output values. This may lead to junk values in the output of the Operator. I should avoid this by modifying PR #174 to zero out the output vecs for the QFunction prior to using them.

I think I will also try to make a quick version of SetValue if the value is zero (wrapped in the same function call).

@jedbrown, @valeriabarra

jedbrown commented 5 years ago

An alternative that would save the (small) performance cost of extra zeroing would be to verify by populating the output vectors with NaN (or using Valgrind's API to mark it undefined) and confirm that it is defined after.

jeremylt commented 5 years ago

I thought about that as well. I suppose one question is if we want to adopt the convention that QFunction authors must explicitly set all fields or if we want to adopt the convention that all fields not set are zero. This is a ~~nonissue~~ different issue if we want users to explicitly set all outputs for clarity.

Setting the memory to undefined/NaN and checking could help users debug complex QFunctions with branching logic.

I was thinking of switching to using memset for SetValue with 0.0 to make the zeroing almost negligable.

jedbrown commented 5 years ago

I think users should generally not declare a dependency if they don't intend to set output fields. An alternative would be to explicitly adopt the convention that QFunctions always sum into their output fields, so that multiple QFunctions could be called in sequence. We'd have to test that they don't overwrite. There isn't much difference between memset and a loop in user code, FWIW.

YohannDudouit commented 5 years ago

I don't think this is a good idea, always sum means one more full memory swipe (We need to read before writing), this has a non negligible impact on performance (especially for GPUs).

jedbrown commented 5 years ago

I don't think it's so clear-cut.

explicit zeroing has the same concern about memory traffic
it may be resident in cache and/or fused/unrolled into registers by a JIT
if we want to have the abstraction of composing QFunctions, then the alternative to summing is to write into separate memory and sum those outside the QFunctions (more register/cache pressure)

jeremylt commented 5 years ago

I agree that for clarity/debugability/etc it is preferable that users set all outputs.

WRT summing, I think this might be the preferable long term approach as we move to operators with multiple QFunctions. It is a lot of movement to write into a separate location and then add each time. I'm thinking if there is a clean way to do that without requiring the user code to write +=.

This discussion will become more complex when we consider more complex operators that have QFunctions with different restrictions/bases for different elements leading to different numbers of quadrature points (mixed topology meshes, mixed order meshes, different finite element spaces for different parts of the PDE, etc)

We perhaps will have Operators with multiple compatible QFunctions and Operators with multiple incompatible sub-Operators (each with compatible QFunctions internally).

YohannDudouit commented 5 years ago

The question is not so much when we have to zero something but when we don't. If we always add, it means that we have to zero something first, instead of just overwriting the memory. (That's 3 memory swipes instead of 1) I think that the issue is more that you should not have been in a case where all values are not set. It is less expensive in general to set all values when computing, than zeroing first and then setting only the required values. I'm not sure that adding QFunction is an important feature of libCEED, I think this should be discussed.

jeremylt commented 5 years ago

Composing PDEs out of less complex operations is something we want to let the users do, so we should have the ability to add together the effects of multiple QFunctions or Operators.

YohannDudouit commented 5 years ago

If you want to have Operator that add instead of set. I think a good way to specify this is to allow a second CEED_VECTOR_ACTIVE as input, meaning that outvec will be both input and output.

YohannDudouit commented 5 years ago

Adding Operators doesn't make sense as they might use different interpolations, or emode, and thus there is absolutely no difference with calling them in sequence. Adding QFunction presents little interest to me as it is relatively simple to provide a function to a QFunction that does the sum of two other functions:

Y f1(X x);
Y f2(X x);
Y addf1f2(X x){ return f1(x) + f2(x); }
CeedQFunctionCreate(addf1f2);

jedbrown commented 5 years ago

I really don't think this is important right now, but the adding composition above implies space to store f1(x) and separate space to store f2(x). I'm entirely happy to continue with these functions writing into (not adding to) their output vectors, but I think there are merits to either choice (and also that this thread is a horrible tangent; just pick one -- the current choice is the easiest to stick with).

CEED / libCEED

Zero/Verify Output Fields Set in QFunctionApply #181