Open seven-mile opened 1 month ago
Current CIRGen may emit `%vi4res = cir.vec.create(..., %vi2a, %vi2b)` for the source OpenCL code `vi4 vi4res = (vi4)(vi2a, vi2b)`, and end up with "inserting elements typed `vi2` into a vector typed `vi4` in LLVM IR".
I don't remember offhand. Does this seem like something done by design (i.e. we already have test cases for this), or is it something we forgot to verify?
Looking at `VecCreateOp::verify`, my impression is that this isn't supported. Didn't you get verification errors?
The corresponding implementation from OG CodeGen is here. It uses shuffle operations to extend two vectors and merge the effective elements into the final result.
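As a rough illustration of the "extend, then merge" idea described above, here is a minimal sketch in plain C++. It is not the OG CodeGen code: `shuffle` is a hypothetical stand-in that models a shufflevector-style operation on `std::array`, and `concatViaWidening` is a name invented for this example. Undefined lanes are modeled as zero.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model of a shufflevector-style operation: lane i of the result
// is read from the concatenation of `a` and `b` at index mask[i].
// A negative mask entry models an undefined lane (left as 0 here).
template <std::size_t M, std::size_t N>
std::array<int, M> shuffle(const std::array<int, N> &a,
                           const std::array<int, N> &b,
                           const std::array<int, M> &mask) {
  std::array<int, M> out{};
  for (std::size_t i = 0; i < M; ++i) {
    if (mask[i] < 0)
      continue; // undefined lane
    const std::size_t idx = static_cast<std::size_t>(mask[i]);
    out[i] = idx < N ? a[idx] : b[idx - N];
  }
  return out;
}

// Builds a 4-lane vector from two 2-lane vectors in two steps:
// 1) widen each operand to 4 lanes (upper lanes undefined),
// 2) merge the meaningful lanes of both widened operands.
std::array<int, 4> concatViaWidening(const std::array<int, 2> &lo,
                                     const std::array<int, 2> &hi) {
  const std::array<int, 4> widen{0, 1, -1, -1};
  const std::array<int, 4> lo4 = shuffle(lo, lo, widen);
  const std::array<int, 4> hi4 = shuffle(hi, hi, widen);
  // Indices 0..3 address lo4, indices 4..7 address hi4.
  const std::array<int, 4> merge{0, 1, 4, 5};
  return shuffle(lo4, hi4, merge);
}
```

Calling `concatViaWidening` on `{1, 2}` and `{3, 4}` yields `{1, 2, 3, 4}` under this model.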
We can do this in CIRGen or in Lowering (keeping the `cir.vec.create(%vi2a, %vi2b)` in CIR, rather than emitting shuffles immediately). I still prefer CIRGen.
Whatever we decide to do in CIRGen, we need to make sure that the corresponding LLVM lowering matches what OG codegen does (in this case, a series of shuffles). However, if we can map the semantics more clearly in CIRGen, we should do it: emitting shuffles in CIRGen potentially makes it harder to retrieve the original information, because we would need to look into the shuffle and recognize that it's just joining two smaller vectors.
I'd prefer to avoid shuffles this early for this, but if it's something we are already doing, then it wouldn't be inconsistent (and we can later improve by adding other ops). I'd also be fine with improving `cir.vec.create` to support the "building from smaller vectors" scenario. Another option would be to introduce operations for extending the number of lanes and use that result to build the vectors, but I'm not sure how well that feeds into `cir.vec.create` later.
@dkolsen-pgi, any suggestions on what might play better here?
GNU vectors do not support concatenating two vectors with the syntax:
vi4 res = (vi4)(a, b);
So I haven't implemented that in CIR.
I think this is best implemented with `cir.vec.shuffle` rather than `cir.vec.create`. Concatenating two vectors is one of the things that shufflevector is designed to do.
Works for me. A concat op would be cool too, but perhaps we could wait until we actually have a pass that would benefit from saving some compile time by not having to look at the mask to reconstruct the concat.
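To make the "look at the mask to reconstruct the concat" cost concrete, a pass would need a check along these lines. This is only a sketch: `isConcatMask` is a hypothetical helper, not an existing CIR or MLIR API, and it assumes the mask is available as a plain list of integer indices.

```cpp
#include <cstddef>
#include <vector>

// Returns true when `mask` selects every lane of the first input followed by
// every lane of the second input, i.e. the shuffle is a plain concatenation.
// `inputLanes` is the lane count of each input vector.
bool isConcatMask(const std::vector<int> &mask, std::size_t inputLanes) {
  if (mask.size() != 2 * inputLanes)
    return false;
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i] != static_cast<int>(i))
      return false;
  return true;
}
```

Note that this simple check only recognizes the direct form (mask `0, 1, 2, 3` over two 2-lane inputs). A concat expressed through widened operands, such as mask `0, 1, 4, 5` over two 4-lane inputs, would need additional analysis of how the operands were produced.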
Note you'd still need an operation to extend these vectors before passing them to a shuffle as input. We could probably use some form of cast for that.
> Note you'd still need an operation to extend these vectors before passing them to a shuffle as input.
That's not necessary. The result vector can have a different size than the two input vectors.
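A small model of that point, assuming shufflevector-like semantics where the result has as many lanes as the mask has entries: `shuffleModel` below is a hypothetical stand-in written for this example, not the `cir.vec.shuffle` implementation, and it assumes every mask index is valid and non-negative.

```cpp
#include <cstddef>
#include <vector>

// Models a shuffle whose result length is set by the mask, not by the inputs.
// Indices [0, n) read from `a`; indices [n, 2n) read from `b`, where n is the
// lane count of each input. All mask indices are assumed to be in range.
std::vector<int> shuffleModel(const std::vector<int> &a,
                              const std::vector<int> &b,
                              const std::vector<int> &mask) {
  std::vector<int> out;
  out.reserve(mask.size());
  const std::size_t n = a.size();
  for (int m : mask) {
    const std::size_t idx = static_cast<std::size_t>(m);
    out.push_back(idx < n ? a[idx] : b[idx - n]);
  }
  return out;
}
```

With two 2-lane inputs `{1, 2}` and `{3, 4}` and the mask `{0, 1, 2, 3}`, the model returns the 4-lane vector `{1, 2, 3, 4}` in a single step, with no separate widening operation.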
Related to PR #613. Suggested test case: