Closed leekillough closed 3 months ago
No issues. It inspires me to do more advanced template programming.
I once did template metaprogramming for expression templates, where a complicated expression like alpha * A * B + beta * C is automatically converted to a DGEMM call. I could send you my presentation slides on it, which also serve as an introduction to C++11 because they date from around the time of C++11's ratification.
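To make the expression-template idea concrete, here is a minimal sketch, not the code from the slides. The Matrix type, the node structs, and gemm() are all hypothetical names invented for illustration, and gemm() is a naive loop standing in for a real BLAS DGEMM call. The overloaded operators do no arithmetic; they only build a small tree that eval() hands to a single gemm() call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical square, row-major matrix type, for illustration only.
struct Matrix {
    std::size_t n;
    std::vector<double> data;
    explicit Matrix(std::size_t size, double fill = 0.0)
        : n(size), data(size * size, fill) {}
    double& operator()(std::size_t i, std::size_t j) { return data[i * n + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * n + j]; }
};

// Expression nodes: they only record operands; no arithmetic happens here.
struct Scaled        { double s; const Matrix& m; };                   // s * M
struct ScaledProduct { double s; const Matrix& a; const Matrix& b; };  // s * A * B
struct GemmExpr      { ScaledProduct ab; Scaled c; };                  // s1 * A * B + s2 * C

inline Scaled operator*(double s, const Matrix& m) { return {s, m}; }
inline ScaledProduct operator*(Scaled l, const Matrix& b) { return {l.s, l.m, b}; }
inline GemmExpr operator+(ScaledProduct ab, Scaled c) { return {ab, c}; }

// Stand-in for a BLAS DGEMM call: out = alpha * A * B + beta * C.
// out may alias c, but must not alias a or b.
inline void gemm(double alpha, const Matrix& a, const Matrix& b,
                 double beta, const Matrix& c, Matrix& out) {
    assert(a.n == b.n && b.n == c.n && c.n == out.n);
    for (std::size_t i = 0; i < a.n; ++i) {
        for (std::size_t j = 0; j < a.n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < a.n; ++k) {
                sum += a(i, k) * b(k, j);
            }
            out(i, j) = alpha * sum + beta * c(i, j);
        }
    }
}

// The whole expression tree collapses into one gemm() call, with no
// intermediate matrices for alpha * A, A * B, or beta * C.
inline Matrix eval(const GemmExpr& e) {
    Matrix out(e.c.m.n);
    gemm(e.ab.s, e.ab.a, e.ab.b, e.c.s, e.c.m, out);
    return out;
}
```

With this in place, eval(alpha * A * B + beta * C) type-checks only for expressions of exactly that shape, which is the point: the pattern match happens in the type system at compile time. The nodes hold references, so they must not outlive the matrices they refer to.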
Advanced template programming techniques include SFINAE (Substitution Failure Is Not An Error), which is used to make templates conditional, like:
#include <type_traits>

template<typename T, typename = std::enable_if_t<std::is_same_v<T, double>>>
T func(T x) {
    return x; // placeholder body
}
The function definition will only be considered if T == double. Unlike a simple overload of func taking double, which would also accept arguments implicitly convertible to double, this definition does not participate in overload resolution when the enable_if_t< ... > condition is false.
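The difference from a plain overload can be checked at compile time. The sketch below assumes C++17; plain() and can_call_func are hypothetical names added only for the comparison. It uses the detection idiom, itself an application of SFINAE, to ask whether a call to func with a given argument type is well-formed.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Constrained template: viable only when T is exactly double.
template<typename T, typename = std::enable_if_t<std::is_same_v<T, double>>>
T func(T x) { return x; }

// Plain overload for comparison (hypothetical): accepts anything
// implicitly convertible to double.
inline double plain(double x) { return x; }

// Detection idiom: the partial specialization is chosen only when
// func(Arg) is a valid expression; otherwise the primary template wins.
template<typename Arg, typename = void>
struct can_call_func : std::false_type {};

template<typename Arg>
struct can_call_func<Arg, std::void_t<decltype(func(std::declval<Arg>()))>>
    : std::true_type {};

static_assert(can_call_func<double>::value);   // T deduced as double: enabled
static_assert(!can_call_func<int>::value);     // T deduced as int: removed from the overload set
static_assert(!can_call_func<float>::value);   // no implicit conversion happens during deduction
static_assert(std::is_invocable_v<decltype(&plain), int>);  // plain(int) converts and compiles

// Usage: both calls compile, but only plain() would accept an int.
inline void usage_example() {
    assert(func(2.5) == 2.5);
    assert(plain(3) == 3.0);
}
```

A call such as func(1) fails to compile with a "no matching function" diagnostic rather than silently converting 1 to 1.0.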
There is also CRTP (Curiously Recurring Template Pattern), where a derived class is passed as a template argument to its base class, so that the base class can perform polymorphism statically at compile time instead of using virtual functions:
template<typename T>
class base {
public:
    T func() { // not virtual
        // casts this to derived* and calls derived::func() statically
        return static_cast<T*>(this)->func();
    }
};

class derived : public base<derived> { // public, so base can perform the cast
public:
    derived func() { // not virtual; must be accessible from base
        return derived{};
    }
};
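As a slightly larger illustration of the same pattern, here is a sketch with hypothetical shape, square, and rect classes (none of these names come from the PR). The base class forwards to a differently named area_impl() in the derived class, which avoids any confusion from name hiding, and generic code accepts any shape<D> without a vtable.

```cpp
#include <cassert>
#include <type_traits>

// CRTP base: Derived is expected to provide area_impl() const.
template<typename Derived>
class shape {
public:
    // Non-virtual: the call target is fixed at compile time by the cast.
    double area() const {
        return static_cast<const Derived*>(this)->area_impl();
    }
    // Shared logic written once in the base, built on the derived hook.
    double scaled_area(double k) const { return k * area(); }
};

class square : public shape<square> {
public:
    explicit square(double s) : side(s) {}
    double area_impl() const { return side * side; }
private:
    double side;
};

class rect : public shape<rect> {
public:
    rect(double w, double h) : width(w), height(h) {}
    double area_impl() const { return width * height; }
private:
    double width;
    double height;
};

// Generic code: D is deduced from the derived-to-base conversion.
template<typename D>
double doubled_area(const shape<D>& s) { return s.scaled_area(2.0); }

// No virtual functions anywhere, so the classes are not polymorphic types.
static_assert(!std::is_polymorphic_v<square>);
static_assert(!std::is_polymorphic_v<rect>);

// Usage: each call resolves statically to the right area_impl().
inline void usage_example() {
    assert(square(3.0).area() == 9.0);
    assert(doubled_area(rect(2.0, 4.0)) == 16.0);
}
```

The trade-off is that shape<square> and shape<rect> are unrelated types, so they cannot share a container the way classes with a common virtual base can.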
This fully templatizes the floating-point helper functions. Previously, the short ones were written explicitly in include/insns/*.h. Now all of them are instantiations of floating-point helper template functions, in preparation for float16 and bfloat16 support. This also improves test coverage by making float and double instructions use the same code.
UnBoxNaN performs the opposite of the existing BoxNaN, so that the GetFP() function is much simpler. It is also fully templatized so that it can support float16 and bfloat16 in the future.
CvtFpToInt was renamed to fcvtif to resemble the naming scheme for instructions, where the destination type and register (integer) is listed first, and the source type and register (float) is listed second. The template parameters indicating the source and destination types are listed in the same order.
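For readers unfamiliar with NaN boxing, the following is a rough sketch of what templated box/unbox helpers and a destination-first conversion template could look like. It is not the code in this PR: box_nan, unbox_nan, fcvt_sketch, and bits_of are hypothetical names, the register is assumed to be 64 bits wide, and the conversion always truncates toward zero, ignoring rounding modes and exception flags. It follows the RISC-V convention that a narrower value is valid only when all upper register bits are 1, and that anything else reads as a quiet NaN.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical trait: unsigned integer type as wide as the FP type T.
template<typename T> struct bits_of;
template<> struct bits_of<float>  { using type = std::uint32_t; };
template<> struct bits_of<double> { using type = std::uint64_t; };

// Box: place T's bits in the low part of a 64-bit register image and
// set every upper bit to 1. A full-width T is stored unchanged.
template<typename T>
std::uint64_t box_nan(T value) {
    typename bits_of<T>::type narrow;
    std::memcpy(&narrow, &value, sizeof(narrow));
    if constexpr (sizeof(T) == sizeof(std::uint64_t)) {
        return narrow;
    } else {
        return (~std::uint64_t{0} << (8 * sizeof(T))) | narrow;
    }
}

// Unbox: the opposite of box_nan. If the upper bits are not all 1s the
// value is not properly boxed and a quiet NaN is returned instead.
template<typename T>
T unbox_nan(std::uint64_t reg) {
    if constexpr (sizeof(T) < sizeof(std::uint64_t)) {
        if ((reg >> (8 * sizeof(T))) != (~std::uint64_t{0} >> (8 * sizeof(T)))) {
            return std::numeric_limits<T>::quiet_NaN();
        }
    }
    auto narrow = static_cast<typename bits_of<T>::type>(reg);
    T out;
    std::memcpy(&out, &narrow, sizeof(out));
    return out;
}

// Destination (integer) type first, source (float) type second,
// mirroring the parameter order described above. Saturates on overflow.
template<typename INT, typename FP>
INT fcvt_sketch(FP x) {
    constexpr INT lo = std::numeric_limits<INT>::min();
    constexpr INT hi = std::numeric_limits<INT>::max();
    // 2^digits is the first value above hi and is exactly representable in FP.
    const FP limit = std::ldexp(FP(1), std::numeric_limits<INT>::digits);
    if (x != x || x >= limit) return hi;     // NaN or too large
    if (x < static_cast<FP>(lo)) return lo;  // too small
    return static_cast<INT>(x);              // in range: truncate toward zero
}

// Usage: a boxed float round-trips; an unboxed one reads back as NaN.
inline void usage_example() {
    assert(unbox_nan<float>(box_nan(1.5f)) == 1.5f);
    float bad = unbox_nan<float>(0x000000003FC00000ull);
    assert(bad != bad);
}
```

Because the width is derived from sizeof(T), adding a 16-bit type in this sketch would only need another bits_of specialization, which is the kind of extension the templatization is meant to allow.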