`bool` is encoded as the `UInt(bits=1, lanes=1)` type. One thing that is indeed true is that we have not clearly specified how to store sub-byte types; in this case they are stored as a full byte.
As I understand it, the size of a `bool` element is usually one byte, so representing it with `bits=1` seems inappropriate. On the other hand, using `UInt(bits=8, lanes=1)` to represent `bool` doesn't allow distinguishing between `bool` and `uint8` types. Therefore, I suggest adding `bool` to `DLDataTypeCode`, both for the flexibility of choosing the desired bool representation (1 or 8 bits) and to avoid overlapping with existing types.
Thanks @alonre24. On compiler projects like LLVM, `uint(1)` is used to represent the bool type (think of it as a register type). That does not preclude us from also defining a storage type for the element type. In this particular case, I believe clearly specifying the storage type for sub-byte types would be sufficient.
Thank you @tqchen, there seem to be different ways in which frameworks decide to implement bool. It is not guaranteed to be `uint(1)`; for instance, in PyTorch it is `uint(8)`, which makes it indistinguishable from a tensor with a `uint(8)` dtype.
There is talk of creating a bit tensor in PyTorch to make bool sub-byte, but at the end of the day one has no guarantees there.
I'm of the opinion that having a separate `DLDataTypeCode` for `bool`, and then using the `bits` field to describe whether that bool is held in one bit or one byte in the buffer, would make a lot of sense for interoperability.
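As a concrete sketch of that idea (entirely hypothetical at this point — `kDLBool` is not in the header), the same type code with different `bits` values would distinguish the two storage widths:

```c
#include <stdint.h>

/* Hypothetical sketch: one bool type code, with the existing `bits`
 * field selecting the storage width. kDLBool and its numeric value
 * are not part of dlpack.h here. */
enum { kDLBool = 6 /* hypothetical */ };

typedef struct {
  uint8_t code;
  uint8_t bits;
  uint16_t lanes;
} DLDataType;  /* local mirror of the dlpack.h struct */

static const DLDataType bool_in_byte = {kDLBool, 8, 1};  /* one byte per bool */
static const DLDataType bool_in_bit  = {kDLBool, 1, 1};  /* one bit per bool  */

int main(void) { return bool_in_byte.bits > bool_in_bit.bits ? 0 : 1; }
```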
> It is not guaranteed to be `uint(1)`; for instance, in PyTorch it is `uint(8)`, which makes it indistinguishable from a tensor with a `uint(8)` dtype.
Indeed, this issue would happen in MPI programs too (`MPI_C_BOOL` vs `MPI_UINT8_T`).
What I originally meant was to specify the sub-byte storage layout, which means `uint(1)` maps to bool and is stored as uint8 in DLPack. When the bool is indeed stored sub-byte in a bitmask setting, it then becomes `uint1x8` (8 lanes of 1 bit), so we can still distinguish the two using the current convention.
I just realized that in CuPy we do not support exchanging bool arrays because of the lack of `kDLBool`.
@tqchen I think this is a nontrivial interpretation that not everybody expects (nor was it documented). For example, the ongoing NumPy PR did not support bool either (https://github.com/numpy/numpy/pull/19083). AFAIK only PyTorch supports it, but it's unclear to me where they learned this. I am +1 for adding `kDLBool` for better clarity.
cc: @rgommers @emcastillo @hameerabbasi @seberg
My view is that adding a new type for bool seems best (and clearest). Does PyTorch even really support it, or does it just export bool as a uint8 (losing the `bool` type information on round-tripping)?
Thanks everyone for the great discussion so far. Trying to summarize the thread.

A0: use `i1` to represent bool

We can use `uint(1)` to represent the intent of the data type. Additionally, the specification says that any sub-byte data type needs to be stored in a full byte. This means uint1 would be stored as uint8 (a full byte), consistent with the current behavior.

For cases where the intention is to pack a bit mask (say, pack 8 bools into a byte), the data type can be represented as the vector type `uint1x8`, which can be stored as a full byte.

A1: introduce bool with bits=8

A natural extension to think about is to introduce a bool type code with bits=8. Note that the default printing behavior might not be too attractive here: while we can say int8, bool8 does not make too much sense as a dtype name.

Both A0 and A1 should achieve the goal of supporting bool in DLPack.
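To make the two options concrete, here is a minimal C sketch (my own illustration, not part of the spec; the struct is mirrored locally, and the `kDLBool` member used under A1 is hypothetical at this point):

```c
#include <stdint.h>
#include <stdio.h>

/* Local mirror of DLPack's DLDataType, for illustration only; the real
 * definition lives in dlpack/dlpack.h. kDLBool is hypothetical here --
 * it is exactly what option A1 would add. */
enum { kDLInt = 0, kDLUInt = 1, kDLFloat = 2, kDLBool = 6 /* hypothetical */ };

typedef struct {
  uint8_t code;   /* a DLDataTypeCode value */
  uint8_t bits;   /* bits per lane          */
  uint16_t lanes; /* number of vector lanes */
} DLDataType;

int main(void) {
  /* A0: bool expressed as uint1; the sub-byte storage rule pads it to a
   * full byte, so it remains distinguishable from uint8. */
  DLDataType a0_scalar = {kDLUInt, 1, 1};  /* "uint1"   */

  /* A0, bit-packed variant: 8 bools packed into one byte. */
  DLDataType a0_packed = {kDLUInt, 1, 8};  /* "uint1x8" */

  /* A1: a dedicated bool type code stored as a full byte. */
  DLDataType a1_bool = {kDLBool, 8, 1};    /* "bool8"   */

  printf("codes: %u %u %u\n", a0_scalar.code, a0_packed.code, a1_bool.code);
  return 0;
}
```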
> For cases where the intention is to pack a bit mask (say, pack 8 bools into a byte), the data type can be represented as the vector type uint1x8, which can be stored as a full byte.

I read this as storing exactly 8-tuples of bool under the specification? So `shape=1` [itemsize] would translate into `shape=8` [bits] booleans/bits for `uint1x8`?
This means:

- `SHAPE` [bits] has to be described as `uint1xSHAPE`; with `SHAPE=15` you get 15 [bits] being `uint1x15`?
- Multi-dimensional bit arrays cannot be described at all (since there is only a single `lanes`).

Also, someone might get the idea to store `uint2` in a packed/unpacked format, at which point the type distinction between booleans and bit-sized integers becomes fuzzy. That might be an unnecessary "what if", but creating `bool` explicitly avoids it.

It seems to me that if there is a possibility of bit-sized stride support (e.g. in the form of bit arrays), creating the explicit bool dtype is a lot cleaner. If that never happens, I still think it is nicer, but I doubt there are serious practical issues with using `uint1` as bool.
I had the impression that the standard already explicitly supports bit-strides, since the header includes this line (in a comment):

```c
size *= (t->dtype.bits * t->dtype.lanes + 7) / 8;
```

although I guess there is a fair chance that nobody would do that, and you can consider it undefined right now.
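As a quick sanity check, here is a small self-contained program (my own illustration, not from the header) applying that rounding rule to the dtypes discussed above:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local mirror of the two DLDataType fields the formula reads. */
typedef struct { uint8_t bits; uint16_t lanes; } Dt;

/* Per-element byte size under DLPack's rounding rule quoted above. */
static size_t elem_bytes(Dt t) { return ((size_t)t.bits * t.lanes + 7) / 8; }

int main(void) {
  Dt uint1    = {1, 1};   /* single bool, padded to a byte */
  Dt uint1x8  = {1, 8};   /* packed bit mask               */
  Dt uint1x15 = {1, 15};  /* the SHAPE=15 example above    */
  printf("uint1:    %zu byte(s)\n", elem_bytes(uint1));    /* -> 1 */
  printf("uint1x8:  %zu byte(s)\n", elem_bytes(uint1x8));  /* -> 1 */
  printf("uint1x15: %zu byte(s)\n", elem_bytes(uint1x15)); /* -> 2 */
  return 0;
}
```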
> A0 is likely more consistent with the low-level compiler implementations, for example LLVM implements bool using `i1`

Yes, but I don't see how that is a reason for or against A0? The storage needs to be clearly defined as 1 or 0 (or possibly as `value != 0` and `value == 0`), of course. But the type-safety aspect is unrelated to what the compiler uses behind the scenes?
Thanks @seberg.

To follow up on the conversation about bit masking: the comment about `uint1x8` was specifically about the bit-level case. In the most common cases we are not doing that, and `uint1` plus the storage spec for sub-byte types would indeed resolve the issue (because all types are padded to a full byte when stored). The same principle likely applies to uint2 as well. `size *= (t->dtype.bits * t->dtype.lanes + 7) / 8;` is a realization of this sub-byte storage convention.

To follow up on the multi-dimensional bit array: we would indeed require some form of logical-to-physical mapping in this case. Explicitly specifying a vector type `uint1x8` or `uint1x32` corresponds to the physical storage plan of the data. Such a plan is also usually easier to manipulate directly in the presence of SIMD. The logical case is a separate question.

In the case of A1, `bool1` creates some inconsistency with the current specification: `bool1` is intended as a bit array, while the sub-byte storage convention would indicate otherwise (that it should be stored as one byte). Of course, as we discussed, bit arrays may be out of scope, so this is not too relevant here; I just want to point out the possible inconsistency.
Yeah, you are right, I managed to mis-parse the code as rounding up the size rather than the itemsize :(. Although I guess many here likely didn't realize that sub-byte datatypes are standardized to be stored padded to full bytes :).
> In the case of A1, `bool1` creates some inconsistency with the current specification, since `bool1` was intended as a bit array

You are right. Do not use `bool1` for bit-strided storage, then (by itself)! What you get is:
- `bool8` indicates a convention where True := `value != 0` and False := `value == 0`
- `bool1` indicates a convention where True := `value == 1` and False := `value == 0`, stored in 1 byte as per the current DLPack standard
- `uint1` can be used for the set of values `{0, 1}` instead of meaning `{False, True}` (which could be surprising!)

(For what it's worth, NumPy can't always guarantee the `bool1` style, as it, annoyingly, supports both in practice. But DLPack could be opinionated about `bool1` vs. `bool8`. If you were to switch to always using sub-byte strides, `bool8` would have to be specified as one of the two meanings.)
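To illustrate the difference, here is a small sketch with hypothetical helper names (mine, not from the thread) of how a consumer might treat an incoming byte under each convention:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers contrasting the two conventions above. */

/* bool8: any nonzero byte reads as True. */
static bool bool8_is_true(uint8_t v) { return v != 0; }

/* bool1, stored in a full byte: only 0 and 1 are valid, and only 1 is
 * True; a byte like 2 is malformed rather than "truthy". */
static bool bool1_is_valid(uint8_t v) { return v <= 1; }
static bool bool1_is_true(uint8_t v)  { return v == 1; }

int main(void) {
  uint8_t v = 2;
  /* Under bool8 this byte is True; under bool1 it is not even valid. */
  return (bool8_is_true(v) && !bool1_is_valid(v) && !bool1_is_true(v)) ? 0 : 1;
}
```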
What about sub-byte strides?
I do agree that there is an inconsistency if you were to use `bool1` to implicitly indicate bit-strided (non-byte-sized) storage but did not define `uint2` to use this. But I do not understand that as an argument against introducing a new `bool` dtype. To me this is primarily a type-safety (and clarity) issue, i.e. the last point.
I have heard bit-mask requests for NumPy often enough to think that it may be good to anticipate them (not necessarily act on them right now, though). The future might, for example, see a flag (e.g. on the dtype field) to indicate true bit-sized storage (non-byte-padded). That flag would allow `bool1`, `uint2`, and even `uint1` to be stored in a packed, sub-byte-strided manner.
@tqchen The discussion seems to have reached an impasse at the `uint` vs `bool` argument. https://github.com/dmlc/dlpack/issues/75#issuecomment-856241824 is a nice summary of the issue. Here is my opinion on the thread:

- `uint(1)` or `uint(8)` as bool: This seems obscure to me, as suggested in https://github.com/dmlc/dlpack/issues/75#issuecomment-856220450. Another argument against it is that there is no way to distinguish between `uint8` and `bool`. But @tqchen noted that it is more consistent with the lower-level implementation in compilers, e.g. LLVM.
- `kDLBool`: More explicit and easier to use, with a clear distinction between `uint8` and `bool`. Worst case, it introduces some redundancy for low-level compilers like LLVM (but that doesn't seem to be a real problem, such as introducing ambiguity, right?).

I would vote for an explicit `kDLBool`. I admit I am not aware of the problems this poses for low-level compiler projects like LLVM, but I don't think adding a `kDLBool` should cause a problem when exporting at such low levels. @tqchen Do you agree? If yes, we can proceed with merging #76. It would definitely be nice to have the `bool` data type in DLPack!
@tqchen @rgommers @seberg @kmaehashi @tirthasheshpatel @jakirkham @kkraus14
I'd like us to revisit this topic, and specifically I'd like to request adding `kDLBool`. Looking at the NumPy/CuPy/PyTorch adoption as of today, it is clear that none of these libraries support exchanging bool tensors, and it has increasingly become a source of bug reports (see the issues linked above). Given that these are already the 3 major Array API adopting libraries, we should add the bool type to clear up the ambiguity and unblock them.
Thanks for rekindling this topic. After thinking a bit more about it, I now also agree that having `kDLBool` is going to be helpful.
I would assume the current state would call for something like `Bool(bits=8, lanes=1)` to represent the current bool type in NumPy/PyTorch?
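In `DLDataType` terms, that proposal would look roughly like the sketch below (assuming a `kDLBool` enum member, which the header does not yet define at this point in the discussion):

```c
#include <dlpack/dlpack.h>  /* assumes a header revision that adds kDLBool */

int main(void) {
  /* Bool(bits=8, lanes=1): one bool per byte, matching the current
   * NumPy/PyTorch bool storage. */
  DLDataType numpy_bool = {
      .code = kDLBool,  /* the proposed new DLDataTypeCode member */
      .bits = 8,
      .lanes = 1,
  };
  return numpy_bool.bits == 8 ? 0 : 1;
}
```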
Sounds good. It might be nice to review whether anyone expects True to be only the bit set, rather than anything `!= 0`.
Seems there is general consensus on this. We can consider a PR to add such a spec. Perhaps @leofang can send a PR?
I think it makes sense to require that True be the only bit set.
Shoooot, my apologies @tqchen, I completely missed the traffic here for some reason. Will do later today!
To represent tensors of type bool: is it possible to add another type (kDLBool) to the `DLDataTypeCode` enum? That would be very useful, as some well-known machine learning platforms (TF, torch, Onnxruntime) support models whose inputs/outputs are tensors of type bool. Thanks!
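For reference, a minimal sketch of the requested extension; the existing members mirror `dlpack.h`, while the exact numeric value assigned to `kDLBool` here is illustrative:

```c
/* Sketch: DLDataTypeCode extended with a dedicated bool code. */
typedef enum {
  kDLInt = 0U,
  kDLUInt = 1U,
  kDLFloat = 2U,
  kDLOpaqueHandle = 3U,
  kDLBfloat = 4U,
  kDLComplex = 5U,
  kDLBool = 6U  /* proposed: tensors of type bool */
} DLDataTypeCode;
```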