mihaibujanca closed this issue 7 years ago
What I am trying to do is something along the lines of
local G = Graph("G", 7,
                "v", {N}, 8,
                "n0", {D}, 9,
                "n1", {D}, 10,
                "n2", {D}, 11,
                "n3", {D}, 12)
local weightedOffset = 0
for i=0,3 do
    -- Lua has no +=, and G.n..i would parse as a string concatenation
    weightedOffset = weightedOffset + Weights(G.v * 4 + i) * Offset(G["n"..i])
end
(I know the above code might not be valid but gets the point through)
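To make the intent of the loop concrete outside of Opt, here is a minimal plain-Python sketch of the same computation. It is not Opt code: the data, the scalar offsets, and the helper name `weighted_offset` are hypothetical and only illustrate the flat `vertex * 4 + i` indexing the sample is reaching for.

```python
# Hypothetical data: 3 vertices, 4 weights per vertex stored flat (vertex * K + i).
K = 4
weights = [0.1, 0.2, 0.3, 0.4,      # vertex 0
           0.25, 0.25, 0.25, 0.25,  # vertex 1
           1.0, 0.0, 0.0, 0.0]      # vertex 2
offsets = [1.0, 2.0, 3.0, 4.0, 5.0]  # one scalar offset per node (hypothetical)


def weighted_offset(v, neighbors):
    """Sum weights[v * K + i] * offsets[neighbors[i]] over the K neighbors of v."""
    assert len(neighbors) == K
    return sum(weights[v * K + i] * offsets[n] for i, n in enumerate(neighbors))


# Vertex 1 with neighbors n0..n3 = nodes 0..3.
result = weighted_offset(1, [0, 1, 2, 3])
```

The open question in the thread is how to express the `weights[v * K + i]` lookup in Opt, since an image is indexed by a graph node or stencil offset rather than by arbitrary arithmetic.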
I'm not sure I fully grok the code sample, but the answer to the initial question is to make W use the same dimensions as A, and either use K components or K different W images (W0, W1, ...). Then you can index into W (or W0, W1, ...) and A using the same index, whether a stencil offset or a graph node.
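The two layouts suggested here can be sketched in plain Python as follows. This only illustrates the data layout, not Opt's API; the names `flat_w`, `w_components`, and `w_images` are hypothetical, and K = 3 with 4 pixels is an arbitrary choice.

```python
K = 3
num_pixels = 4

# Flat layout from the question: weight i of pixel p lives at p * K + i.
flat_w = [float(p * K + i) for p in range(num_pixels) for i in range(K)]

# Option 1: one image with the same dimensions as A, K components per pixel.
w_components = [tuple(flat_w[p * K:(p + 1) * K]) for p in range(num_pixels)]

# Option 2: K separate single-component images W0, W1, ...,
# each with the same dimensions as A.
w_images = [[flat_w[p * K + i] for p in range(num_pixels)] for i in range(K)]

# The same pixel index p now addresses both A and the weights.
p, i = 2, 1
same = flat_w[p * K + i] == w_components[p][i] == w_images[i][p]
```

In both options the `p * K + i` arithmetic disappears from the lookup: the pixel index selects the element and `i` selects a component or an image.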
How would I use K components in an Array?
If your datatype is float, and K is six, make the datatype float6.
Oh ok. And would I be able to access that like Weights(G.v)(0) then?
Yes. If your datatype has more components, then you'll need to do a small amount of indexing math, but it still should be straightforward. There is a Slice(image, startComponent, endComponent) construct you could use if you want as well to make a sort of "view" on the image of only some components:
W0 = Slice(Weights,0,2)
but it probably isn't any easier than doing the indexing without it. Our ARAP examples used it originally, but we found it improved clarity to separate the position and rotation into separate Arrays instead of doing it within the energy function using Slice.
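As a rough illustration of what a component "view" means, here is a plain-Python sketch. It is not Opt's Slice: the helper `slice_components` is hypothetical, it copies data rather than aliasing it, and it assumes the end index is exclusive, which the thread does not state.

```python
def slice_components(image, start, end):
    """Return, per pixel, components [start, end) (half-open range; an assumption)."""
    return [pixel[start:end] for pixel in image]


# Hypothetical 2-pixel image with six float components per pixel.
weights = [(0.1, 0.2, 0.3, 0.4, 0.5, 0.6),
           (1.0, 2.0, 3.0, 4.0, 5.0, 6.0)]

# Analogue of W0 = Slice(Weights,0,2) under the half-open assumption.
w0 = slice_components(weights, 0, 2)
```

Keeping separate arrays, as the ARAP examples ended up doing, avoids this extra layer in the energy function.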
Oh that's useful to know! Beyond the code being easier to understand, is there any advantage to using one approach over the other?
There is interplay between the memory systems on NVIDIA GPUs that changes with compute capability in terms of minimizing cache misses and coalescing memory fetches. One might be slightly faster than the other on your device. That is a last-mile optimization I'd ignore until absolutely necessary.
Thanks, this was useful!
Suppose I have an image A, and for each pixel there are K associated weights, W. If I know that the size of W is K * A.size(), and that for every pixel, the corresponding weights are at pixel_position * K + i, for i in 0,K, is there any way of putting this into Opt code?
The only way right now seems to be creating a Graph with K nodes, however this can be impractical (and for a large K, it could get pretty bad). Would anything along the lines of making W of size {A.size, K} (as opposed to {A.size()*K}) work, or is there any other way?
Could this somehow be done with a Stencil instead?