Open kostrzewa opened 7 years ago
In practice one can simply disable half precision.
I have realized the complication of the half
type only a week ago, therefore one cannot just have scalar values as FT
because the conversion is not the one that we would expect.
It has just struck me what we might be able to do instead: C++ allows for implicit conversion operators. So I would throw in the following to the discussion: An actually new type that is not just a typedef. This means that implicit conversions from double
to Half
are not possible without going through the constructor that we write. Then all the intermediate functions on the QPhiX side would use this Half
class, the kernels still take half
. Arrays will not be converted automatically, so that is an issue, but double *
is not converted into float *
either. This is a complete example that compiles and works:
#include <iostream>
typedef unsigned short half;
// Mock of the `rep` function to see whether it works.
template <typename A, typename B>
A rep(B b) {
return b + 10;
}
class Half {
public:
// Implicit constructors.
Half(half const x) : x_(x) {}
Half(float const x) : x_(rep<half, float>(x)) {}
Half(double const x) : x_(rep<half, double>(x)) {}
// Conversion operator that allows using the `Half` class in places where
// `half` is required.
operator half() { return x_; }
private:
half x_;
};
void func(half x) {
std::cout << x << std::endl;
}
void intermediate(Half x) {
func(x);
}
int main() {
double a = 1.0;
intermediate(a); // Program will print 11 here.
}
The double
is converted to half
via the Half
class implicitly in the chain of function calls. The bad thing is that the built-in conversion (which is not what we want) will also happen implicitly without an error. One should just have not abused the unsigned short
in the code, it is bad enough that the Intel intrinsics use that type.
And while we are at it, the sizeof(unsigned short)
is probably not guaranteed to be sizeof(float) / 2
, right? So it would be better to use uint16_t
in order to get the correct number of bits.
This class Half
would require to change pretty much all points in QPhiX where numbers are passed. Perhaps just passing FT = Half
as the template parameter will make it magically happen as we want. Probably it is just easier to use rep<>
in the locations where needed and have test cases to check that one did not miss anything.
@martin-ueding
As a result, https://github.com/kostrzewa/tmLQCD/blob/3285769b5b279a31d15265d0bd07ac1e8485ca5e/qphix_base_classes.hpp#L45 is unlikely to work correctly...
In fact, the fourth argument in https://github.com/kostrzewa/tmLQCD/blob/3285769b5b279a31d15265d0bd07ac1e8485ca5e/qphix_base_classes.hpp#L129 is converted to int because of the sign...
Similarly, our packers probably fail (silently) because of this and we might need to use rep<FT,double> for this.