The device-slicing logic was broken at the C++ level: the buffer's device refcount for crops and slices assumed that the parent buffer it was cropped or sliced from had the same number of dimensions. That assumption is wrong for both cropping and slicing:
- Slicing a buffer reduces its dimensionality by definition, so the parent always has more dimensions than the slice.
- Cropping a sliced buffer inherits the "cropped_from" field from the sliced Buffer, and that field refers to a parent with a higher dimensionality.
This was never caught because all existing tests used buffers with a variable number of dimensions (i.e., AnyDims). I therefore added a test that does a device slice on a Buffer<int, 3>.
The solution is to store that "cropped_from" Buffer as a generic Buffer<T, AnyDims>.
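
To illustrate why the parent has to be stored without a fixed rank, here is a minimal, self-contained sketch. It is a toy model, not the actual buffer implementation: `ToyBuffer`, `parent_for_child`, and `parent_rank_after_slice_and_crop` are hypothetical names invented for this example, and the `AnyDims = -1` marker is an assumption of the sketch. It only shows that a slice of a 3-D buffer, and a crop of that slice, can both hold a reference to the 3-D parent once the parent is stored as `ToyBuffer<T, AnyDims>`.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Toy marker for "rank not fixed at compile time" (assumed value for this sketch).
constexpr int AnyDims = -1;

// Hypothetical stand-in for a buffer with an optional compile-time rank.
template <typename T, int Dims = AnyDims>
struct ToyBuffer {
    std::vector<int> extents;

    // The parent is stored with AnyDims, so its rank may differ from ours.
    // Storing it as ToyBuffer<T, Dims> would be wrong after a slice.
    std::shared_ptr<ToyBuffer<T, AnyDims>> cropped_from;

    explicit ToyBuffer(std::vector<int> e) : extents(std::move(e)) {
        assert(Dims == AnyDims || static_cast<int>(extents.size()) == Dims);
    }

    int dimensions() const { return static_cast<int>(extents.size()); }

    // A child inherits our parent if we have one; otherwise we are the parent.
    std::shared_ptr<ToyBuffer<T, AnyDims>> parent_for_child() const {
        if (cropped_from) return cropped_from;
        return std::make_shared<ToyBuffer<T, AnyDims>>(extents);
    }

    // Cropping keeps the rank and shrinks one extent.
    ToyBuffer<T, Dims> cropped(int d, int new_extent) const {
        std::vector<int> e = extents;
        e[d] = new_extent;
        ToyBuffer<T, Dims> result(std::move(e));
        result.cropped_from = parent_for_child();
        return result;
    }

    // Slicing removes dimension d, so the static rank drops by one.
    ToyBuffer<T, (Dims == AnyDims ? AnyDims : Dims - 1)> sliced(int d) const {
        std::vector<int> e = extents;
        e.erase(e.begin() + d);
        ToyBuffer<T, (Dims == AnyDims ? AnyDims : Dims - 1)> result(std::move(e));
        result.cropped_from = parent_for_child();
        return result;
    }
};

// Slice a statically 3-D buffer, crop the slice, and report the parent's rank.
inline int parent_rank_after_slice_and_crop() {
    ToyBuffer<int, 3> full({4, 5, 6});
    auto slice = full.sliced(2);     // static rank 2, parent has rank 3
    auto crop = slice.cropped(0, 2); // static rank 2, inherits the rank-3 parent
    static_assert(std::is_same<decltype(crop), ToyBuffer<int, 2>>::value,
                  "a crop of a slice keeps the slice's static rank");
    return crop.cropped_from->dimensions();
}
```

In this sketch, the crop has a static rank of 2 while the parent it references has rank 3. That mismatch is exactly the case the equal-dimensions assumption could not represent.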