Closed mattkretz closed 7 years ago
This interacts with my to-be paper P0553R0, which defines popcount
and countl_zero and countr_zero for unsigned integral types [names up for debate]. I would expect, for example, popcount(datapar<T>(...))
to operate on each element of the datapar.
Assuming that masks are (in that sense) just a datapar<bool>
, this doesn't really fit. Put differently, a popcount (etc.) is a reduction operation and should be named / used as such.
So, we have option 1: Have some generic reduce on masks that can do such things (no idea on the interface; certainly different from reduce on datapar if we want to implement "first_set" with it. Or option 2: Rename to hpopcount and hfirst_set etc, similar to hmin and hmax (with names aligned with whatever the outcome of P0553R0 is).
I'd expect:
datapar<uint32_t> x = [](uint32_t i) { return i; }; // [0 1 2 3 | 4 5 6 7]
x = popcount(x); // [0 1 1 2 | 1 2 2 3]
... and int popcount(bool x) { return x; }
to be quite meaningless.
All the mask
reduction functions produce a scalar. I.e. they are horizontal reductions implicitly, because nothing else makes sense on a datatype that only has two values. Do you really think we should call them hall_of
, hany_of
? ;-) Otherwise we'd be inconsistent again.
I agree that the popcount(datapar) example you showed would be the "obviously desired" outcome. What's wrong with it?
Suppose you haven a large array of bits, where a bit is set if the corresponding member of the base population has a certain attribute. (Think "is female".) You're doing this with a separate bit array because it's more cache-efficient. Now you want to count how many females there are. It seems SIMD popcount is the natural fit, with intermediate results SIMD-summed and then reduced via hsum at the end.
I do agree that popcount(bool)
makes no sense, unless naturally completing the picture without adding dangers.
Regarding reductions: All reductions (not just mask reductions) produce a scalar, that's their defining property. For datapar, we have vector operations (e.g. + or sin) and we have reductions (e.g. hmin or reduce). On masks, we have vector operations (e.g. || or &&) and we have reductions (any_of, all_of, "popcount"). Maybe reductions on masks are identified by "_of"? popcount_of? We would then have something like
int popcount_of(mask m) { return popcount(m.to_scalar_one_bit_for_each_element()); }
I agree that the popcount(datapar) example you showed would be the "obviously desired" outcome. What's wrong with it?
It's not a horizontal operation in the sense that it returns a scalar. It's a horizontal operation on the bits of the elements and thus returns a vector. That's not wrong. Just possibly surprising.
Considering that a mask has bool
elements there can be no (meaningful) horizontal operations on the elements. Thus, mask rather is a bit array in the face of horizontal operations.
With all that I'm trying to say that popcount(mask)
is unambiguous.
Maybe LEWG has an opinion on this. For now, removing popcount(bool) seems like the easiest good choice.
Let me try to formulate a "poll question":
Should there be overloads for where
and (mask and datapar) reductions with bool/builtin types in place of mask/datapar?
If no, that makes things easier here, but adds a burden for users that want generality.
OK, so P0214 should provide bool
/ builtin arithmetics overloads. AFAICS the where
overload is safe. The reductions are not safe because of implicit conversions to bool
.
How about an argument of the following type instead of bool:
class nonconvertingbool {
public:
nonconvertingbool(bool);
nonconvertingbool(int) = delete;
operator bool() const;
};
Makes anything that is convertible to bool
and int
ambiguous and int
explicitly deleted.
Ouch. That wasn't intended. Good catch. The intention was to support generic algorithms that can be instantiated with builtin and datapar types. Two options: a) drop the scalar overloads b) template enable_if_t<is_same<T, bool>::value, int> popcount(T);