Closed robertmaxton42 closed 6 years ago
God, this whole strides/offset business is such a giant can of worms. Hopefully I got it right this time. At least now Reikna's Array objects know about their offsets and can be used with *_like functions safely.
I wasn't sure what the best way to expose padded array creation is, so for now it is just Type.padded(dtype, shape, pad), which you can pass to e.g. temp_array_like().
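For readers wondering what a padded type implies for the memory layout, here is a minimal pure-Python sketch. It assumes `pad` means that many extra elements on both sides of every axis of a C-contiguous array; `padded_layout` is a hypothetical helper written for illustration, not reikna's code, and the real `Type.padded` may compute its parameters differently.

```python
def padded_layout(shape, itemsize, pad):
    """Hypothetical helper: strides, offset and buffer size (all in bytes)
    for a C-contiguous array with `pad` extra elements on both sides
    of every axis. This is an assumed layout, not reikna's actual code."""
    full_shape = [dim + 2 * pad for dim in shape]
    # Build byte strides from the innermost axis outwards.
    strides = [itemsize]
    for dim in reversed(full_shape[1:]):
        strides.append(strides[-1] * dim)
    strides.reverse()
    # Skip `pad` elements along each axis to reach the first real element.
    offset = sum(pad * stride for stride in strides)
    # The buffer must cover the whole padded array.
    nbytes = strides[0] * full_shape[0]
    return tuple(strides), offset, nbytes


strides, offset, nbytes = padded_layout((4, 5), itemsize=4, pad=1)
print(strides, offset, nbytes)  # (28, 4) 32 168
```

With a 4x5 array of 4-byte elements and one element of padding, the padded shape is 6x7, so each row takes 28 bytes, the data starts 32 bytes in (one padded row plus one element), and the whole buffer is 168 bytes.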
BTW, the business with ignoring strides seems to be fixed by this massive PR to PyCUDA. Hard to tell when it will be merged though.
Also, besides padded(), an arbitrary buffer size can now be specified by the nbytes keyword to Thread.array() and other array-allocating functions.
It works! ... well, at least the toy example I was testing with works! Now to try and get my original code working again.
Thanks for all the help!
Glad to hear it! Feel free to close the issue when you have tested your code.
Question: is it an intended property of temp_arrays that one temp array might live in the offset/padding bytes of another? In retrospect this seems plausible, but it wasn't immediately obvious to me that that was the case on a first read.
Technically, the only guarantee is that the "virtual buffers" of two temporary arrays used in the same kernels do not intersect. But at the moment only two temporary managers are available: one that allocates a separate buffer for each temporary array (used for testing), and one that puts each temporary array at the beginning of a physical buffer (the default). I had an idea of creating one that would try to pack them at different offsets, which would make it somewhat more memory-efficient, but at the time offsets did not work well, and it also requires a more complicated algorithm.
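To illustrate the guarantee described above, here is a toy sketch of a manager that always places a temporary array at offset 0 of some physical buffer and lets two arrays share a buffer only if they are never used together. It is not reikna's implementation: `assign_buffers`, the `sizes` mapping and the `dependencies` mapping are hypothetical names, and the dependencies are assumed to be symmetric.

```python
def assign_buffers(sizes, dependencies):
    """Toy illustration, not reikna's implementation.

    sizes: {array name: size in bytes}
    dependencies: {array name: set of names it must not overlap with},
                  assumed symmetric.
    Every array is placed at offset 0 of a physical buffer; a buffer is
    reused only if none of its current tenants conflicts with the new array.
    """
    buffers = []    # each entry: {"size": int, "tenants": set of names}
    placement = {}  # array name -> index of its physical buffer
    for name, size in sizes.items():
        conflicts = dependencies.get(name, set())
        for idx, buf in enumerate(buffers):
            if not (buf["tenants"] & conflicts):
                buf["tenants"].add(name)
                buf["size"] = max(buf["size"], size)
                placement[name] = idx
                break
        else:
            # No compatible buffer found: allocate a new one.
            buffers.append({"size": size, "tenants": {name}})
            placement[name] = len(buffers) - 1
    return placement, [buf["size"] for buf in buffers]


sizes = {"a": 1024, "b": 512, "c": 2048}
dependencies = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
placement, buffer_sizes = assign_buffers(sizes, dependencies)
print(placement, buffer_sizes)  # {'a': 0, 'b': 1, 'c': 0} [2048, 512]
```

Here `a` and `c` never appear in the same kernel, so they share physical buffer 0. This is why one temporary array can end up overlapping bytes that another treats as its own data or padding.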
Ah, good to know. At any rate, it seems to me that there's no way to zero-initialize a temp_array's offset bytes without resorting to a kernel built for the purpose -- I think that belongs in this issue, as it's still padding-creation relevant?
At any rate, it seems to me that there's no way to zero-initialize a temp_array's offset bytes without resorting to a kernel built for the purpose
Or, alternatively, you could have a condition in the kernel that uses it, which just gives 0 instead of reading from memory.
In general, it feels to me that if you need some specific values before the actual start of the data, then maybe the actual data should start there instead. And if you want some kind of an automatic initializer for a temporary array, I think it's a separate issue.
How's your original code, by the way? Is it working with the recent changes?
Not working because of zero-initialization-in-padding-bytes problems :P.
Buuuut that is now officially fixed and the code is working! (... Well, there's still bugs but it's not reikna-related, so closing the issue!)
Currently, there is no way to specify temporary arrays or parameters with arbitrary padding in reikna. In particular, while arbitrary padding of axes beyond the 0th can be achieved with custom strides and arbitrary initial padding can be achieved with offset, there is no way to specify an array with padding after the data.
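To make the gap concrete: shape, strides and offset together fix the last byte a view ever touches, so padding after the data can only come from a buffer that is larger than that. The sketch below computes that minimum in plain Python; `min_nbytes` is a hypothetical helper, and it assumes positive strides and a non-empty shape.

```python
def min_nbytes(shape, strides, offset, itemsize):
    """Hypothetical helper: smallest buffer size (in bytes) that contains
    every element addressed by the view. Assumes positive strides and a
    non-empty shape."""
    # Byte position of the last element: step (dim - 1) times along each axis.
    last = offset + sum((dim - 1) * stride for dim, stride in zip(shape, strides))
    return last + itemsize


shape, itemsize = (4, 5), 4
strides = (32, 4)  # rows are 8 elements apart: 3 padding elements between rows
offset = 16        # 4 padding elements before the data

required = min_nbytes(shape, strides, offset, itemsize)
nbytes = required + 64  # an explicit buffer size leaves 64 bytes after the data
print(required, nbytes - required)  # 132 64
```

Anything beyond `required` is trailing padding, which is the part that strides and offset alone cannot express and that an explicit buffer size (such as the nbytes keyword discussed in this thread) has to supply.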