Closed isaaclegred closed 1 month ago
I guess that also works, but there will be many such call sites, so it will likely clutter the code. More generally, I would expect that if `max(x)` is defined for both `DataVector` and `double`, and `max(x, y)` is defined for `(DataVector, DataVector)`, `(DataVector, double)`, and `(double, DataVector)`, then we should additionally have an implementation for `(double, double)`.

Also, consider that we may not know the types of `x` and `y`, so having the overload seems like the easiest change, rather than having to add `using std::max` everywhere.
```cpp
template <typename T1, typename T2>
void my_function(..., const T1& x, const T2& y) {
  ...  // some other work here
  const double the_max = max(x, y);
  ...
}
```
Not knowing the types is exactly the intended use of the `using` statement. If you know they're doubles you can just call `std::max` explicitly.

Importing things from `std::` for overload resolution is a common C++ pattern; there are ~200 examples in SpECTRE. I'm not seeing a reason to avoid it here.
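For concreteness, the `using std::max` customization-point pattern being advocated here looks roughly like the following sketch (`mylib`, its `DataVector` stand-in, and `generic_max` are illustrative names, not SpECTRE's actual code):

```cpp
#include <algorithm>

namespace mylib {
// Hypothetical stand-in for SpECTRE's DataVector, for illustration only.
struct DataVector {
  double value;
};

// Found via argument-dependent lookup (ADL) when an argument is a mylib type.
inline DataVector max(const DataVector& a, const DataVector& b) {
  return {a.value > b.value ? a.value : b.value};
}
}  // namespace mylib

// The Core Guidelines C.165 pattern: import std::max locally, then make an
// unqualified call so overload resolution can also consider ADL candidates.
template <typename T1, typename T2>
auto generic_max(const T1& x, const T2& y) {
  using std::max;  // fallback for fundamental types like double
  return max(x, y);
}
```

With this, `generic_max(1.0, 2.0)` resolves to `std::max`, while calling it on two `mylib::DataVector`s resolves to `mylib::max` via ADL.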
(If we do end up merging this, you're missing the include for the functions.)
For me personally, it seems a bit tedious, if I want to do `max(x, y)` (and I don't know the types of `x`/`y`), to have to remember to add `using std::max` before it every time. Moreover, I think we should be consistent with the existing fallback for `min`/`max` with a single `double`.
@sxs-collaboration/spectre-core-devs I don't think anyone's going to convince anyone else in the current discussion. Can we get some more input?
To me it looks like there are two related but somewhat separate things.
One is transparent usage of overloads. This is a somewhat low-level C++ issue with overload resolution. The preferred solution given by the C++ Core Guidelines is `using std::BLAH`: https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c165-use-using-for-customization-points These customization points are a bit tedious, but for some things, like `std::swap`, they can actually be very important. The issue with globally importing things, which this is effectively doing, is that it can also wreak havoc on overload resolution and ADL in different ways. E.g. what is a compiler error now could become an implicit cast plus a call to a function. There is also a consistency issue: do we then define `swap` for fundamentals as well? What about other functions like `std::abs`? This can cause quite the snowball effect and is quite tedious to maintain. Again, it can also cause issues where you would have expected a compiler warning but now don't get one. It's annoying, I agree, and I wish we had `std::double` instead of `double`, but I think we should stick with what is recommended and used by the C++ community overall.

The other is an apparent inconsistency with the current `min`/`max` for single arguments. I don't see that being the case. The C++ standard does not provide such a single-argument implementation (https://en.cppreference.com/w/cpp/algorithm/max), so we do this purely for interoperability with Blaze. If there were a `std`-supplied one, the recommendation would be `using std::max`. Since we are not allowed to put anything into namespace `std` (broadly speaking, anyway), we have no choice unless we want to write `if constexpr (tt::is_vector_v<T>) {...} else {...}`, which is a lot more tedious than `using std::max;`. Heck, even `using namespace std` at function level is not completely insane (I don't love it, but it's likely fine if done inside a function that's inside a cpp file). I'm personally still of the opinion that we should've put all our code inside a `spectre` namespace so that things like export/import and general interoperability with other packages would be easier. Then it would be clearer that the single- and double-argument variants are not so different at the call site. However, I don't agree that these two cases are equivalent and should be treated the same.
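As a sketch, the `if constexpr` alternative mentioned above would look roughly like this (`is_vector_v` here is a simplified stand-in for SpECTRE's `tt::is_vector_v` trait, and `dispatch_max` is an illustrative name):

```cpp
#include <algorithm>

// Simplified stand-in for SpECTRE's tt::is_vector_v trait; a real
// implementation would detect Blaze/DataVector-like types.
template <typename T>
inline constexpr bool is_vector_v = false;

// Illustrative dispatch: element-wise max for vector types, std::max for
// fundamentals. This is the more tedious alternative to `using std::max;`.
template <typename T>
T dispatch_max(const T& x, const T& y) {
  if constexpr (is_vector_v<T>) {
    return max(x, y);  // found via ADL for vector types (e.g. from Blaze)
  } else {
    return std::max(x, y);
  }
}
```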
On the first point, I don't think the case given above is analogous to what will be going on in the example. In numeric code, where we expect the argument to be either a `DataVector` or a `double`, we don't want "generic" code; there's no reason why any types other than `double` and `DataVector` should be going into these functions. More importantly, this code should fail if someone attempts to pass in a type that is not `double` or `DataVector`.
On the second point:

> so we do this purely for interoperability with Blaze

This is my only real point: I don't think we need this from a technical point of view, but it makes interoperability with Blaze significantly easier. It may be tedious to maintain, but so is code that has 30 instances of `using std::max`, which also puts a bunch of functions we don't need into the global namespace. I'd even be fine with defining the function in an anonymous namespace in the cpp file, but I see no reason to add 25 lines to the file, making the code less readable and increasing the chance of a bug by putting more things in the global namespace, and I don't think the standard is really suggesting that this is what we should do here.
Being able to operate on more than one type is generic code. While it doesn't require all types to work, it works for various types: any class that supports the given operations is acceptable, e.g. something like `ModalVector` could be fine too. Additionally, a `float` would work, as would a single-precision equivalent of `DataVector`.
I don't think this actually has anything to do with Blaze. You are running into an issue with ADL, fundamental types, and the standard library. It looks like the same case, but it's not that you are missing a standard implementation of a function; you are just trying not to use it (albeit because C++ makes it more difficult). This means you will actually shadow the STL implementation in some cases, and then there is no way for someone to choose between your implementation and the STL's. Now, in this case they happen to do the same thing, but that's not guaranteed.

Also, `using` declarations don't import into the global namespace; they import into the local scope. This is a key difference. If you use a `using` declaration in a header, the function has not gained visibility outside the scope of the `using`. If you instead import it through a header that sticks the function into the global namespace, then it is visible in the global namespace to everything that includes any header transitively including yours. That is, one of these is a local change, the other a global one. Will and I (and the Core Guidelines) are advocating that local changes are easier to maintain than global ones, because weird collisions at link time are less likely. For example, we may run into issues linking against other codes that do something similarly frowned upon: if we include a header from a library that does the same thing, we get duplicate function definitions, which manifests at best as a compiler error and at worst as a linker error. The reason strong namespacing, and encapsulation more broadly, is encouraged is actually to reduce maintenance work. This is in part because even if something is inconvenient, if it's in a namespace I can ignore it and stick my obviously better solution in a different namespace :D

Now, if we had put everything in a `spectre` namespace, then adding a `max`/`min` of `double` inside the `spectre` namespace wouldn't really be an issue. Everything I just described is sidestepped, because ideally nobody else is using the `spectre` namespace. Maybe introducing a `spectre` namespace is something we should revisit more seriously...
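To illustrate that point: overloads for fundamentals become unobjectionable once they live inside a project namespace. A minimal sketch, assuming a hypothetical `spectre` namespace (which SpECTRE does not currently use):

```cpp
#include <algorithm>

// Hypothetical: project code wrapped in its own namespace. These fundamental
// overloads no longer touch the global namespace, so they cannot collide at
// link time with another library that defines max(double, double) globally.
namespace spectre {
inline double max(const double a, const double b) { return std::max(a, b); }
inline double min(const double a, const double b) { return std::min(a, b); }
}  // namespace spectre
```

Inside `namespace spectre`, an unqualified `max(x, y)` would then find these overloads without any `using std::max` at the call site.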
Ok, I certainly concede that local changes are easier to maintain than global ones, and I absolutely buy that a `spectre` namespace is the right way to handle this problem in the long (short?) term, but I'm still not convinced that the case addressed in the C++ Core Guidelines example is analogous to what will happen in generic numeric code. It seems to me the whole point of `DataVector`/`double` (/`ModalVector`?/`float`?) polymorphism should be that someone reading the code doesn't have the clarity of the computation bogged down by worrying about the types of the arguments at every numeric function call. Now, it just so happens that certain functions (`cos`, `sin`, `+`, `-`, ...) are in the global namespace and `max(double, double)` is not; I think it's pretty clear that if they were all in the `std::` namespace and we followed this convention, then code would be effectively unreadable (i.e. just because something is consistent with the standard doesn't mean it is a good idea).

I concede that adding it to the global namespace is not the right call, and I'm happy to close this PR, but I think we should seriously consider alternatives to the `using std::blah` option; it introduces all sorts of problematic human-readability challenges. For example, should it be stated at function level? Then refactoring code into helper functions gets harder, because you have to keep track of whether you used any of the `std::` functions in the part of the code you factored out. Should it happen at every call site? Then you may have 10 lines of code with every other line a `using std::something`, which makes it much harder for a beginner to follow the flow of the calculation. I think anything between these options (except maybe scope level?) would be ad hoc and pretty difficult to standardize, so I think it has to be one of these two.
Proposed changes

There is currently an implementation of `max()` with a single `double` argument, so that `max(x)` has the desired behavior whether `x` is a `DataVector` or a `double`. This PR adds an additional implementation, `max(x, y)`, for `x, y` doubles. This behavior is consistent with the `DataVector` behavior, already inherited from Blaze, in which `max(x, y)` with `x` a `DataVector` and `y` a `double` returns a `DataVector` of element-wise maxima. Additionally implemented: the same thing, but with `max` -> `min`.

Upgrade instructions
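A minimal sketch of the fallback overloads this PR describes (illustrative only; in SpECTRE the `DataVector` overloads are provided by Blaze, so only the scalar cases would be new):

```cpp
// Illustrative sketch of the proposed double-double fallbacks, so that
// unqualified max/min calls work for scalars as well as DataVectors.
inline double max(const double a, const double b) { return a > b ? a : b; }
inline double min(const double a, const double b) { return a < b ? a : b; }
```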
Code review checklist

- `make doc` to generate the documentation locally into `BUILD_DIR/docs/html`. Then open `index.html`.
- Label the PR `bugfix` or `new feature` if appropriate.

Further comments
This is necessary for an additional PR I'll be making soon, but I'm trying to reduce that PR's size.