[VectorOps] Implement Lowering of Various Stride operations to LLVM

nicolasvasilache commented 4 years ago

This is a bigger task that is composed of the following smaller subtasks:

[ ] lower vector.strided_slice to LLVM
[ ] lower vector.insert_strided_slice to LLVM
[ ] lower vector.extract_slices to LLVM (via partial lowering to other vector and std ops: we should not try to create a descriptor for a tuple<vector<> ... > at least not for now)
[ ] lower vector.insert_slices to LLVM (via partial lowering to other vector and std ops: we should not try to create a descriptor for a tuple<vector<> ... > at least not for now)
[ ] write simple tests that can also be piped through the llc and opt tools to exercise LLVM's peephole optimizer and make sure that we are able to get to lower-level vector broadcast and permute operations without roundtrips to memory
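
For readers unfamiliar with the ops in the checklist above, here is a minimal Python sketch of what a strided slice is assumed to compute. It is a reference model over nested lists, not the MLIR implementation or the lowering itself. The function name `strided_slice` and the `offsets`/`sizes`/`strides` parameters are illustrative. The rule that dimensions without an offset are kept whole is an assumption about the op's semantics.

```python
def strided_slice(vec, offsets, sizes, strides):
    """Reference model of a strided slice over nested lists (outermost dim first).

    Assumption: dimensions beyond len(offsets) are kept whole.
    """
    if not offsets:
        # No more sliced dimensions: return the remaining sub-vector unchanged.
        return vec
    off, size, stride = offsets[0], sizes[0], strides[0]
    return [
        strided_slice(vec[off + i * stride], offsets[1:], sizes[1:], strides[1:])
        for i in range(size)
    ]


# A 4x4 "vector" whose element at (r, c) is 10 * r + c.
v = [[10 * r + c for c in range(4)] for r in range(4)]

# Rows 1..2 and columns 0..1, with unit strides.
print(strided_slice(v, [1, 0], [2, 2], [1, 1]))  # [[10, 11], [20, 21]]
```

A lowering that avoids roundtrips to memory would have to express the same element selection with in-register extract, insert and shuffle operations, which is what the last checklist item is meant to verify.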

nicolasvasilache commented 4 years ago

Also + @tetuante and @AlexandreEichenberger in case of interest

aartbik commented 4 years ago

Progress so far:

https://reviews.llvm.org/rG65678d938431c90408afa8d255cbed3d8ed8273f
https://reviews.llvm.org/rG2d515e49d89c0738ccef8f1733d5f9afe00ee979
https://reviews.llvm.org/rG0361a961c2417756eaf28d3debe484e84d484f04

aartbik commented 4 years ago

These two CLs implement the progressive lowering (slices to slice ops) and the LLVM lowering (into LLVM IR). The examples now run "end-to-end" (taking VectorOps as input):

459cf6e5006a [mlir] [VectorOps] Lowering of vector.extract/insert_slices to LLVM IR
303fddeeab10 [mlir] [VectorOps] Rewriting of vector.extract/insert_slices to other vector ops
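
As a rough illustration of the progressive lowering idea, the Python sketch below models splitting a 2-D vector into tiles and reassembling it, where each tile corresponds to one unit-stride slice. This is a hedged reference model, not the code in the CLs above. The helper names `extract_slices` and `insert_slices` mirror the op names for readability only. The row-major tile order and the smaller tiles at the boundary are assumptions about the ops' semantics.

```python
def extract_slices(vec, sizes):
    """Split a 2-D nested list into tiles of at most sizes[0] x sizes[1].

    Assumptions: tiles are returned in row-major order, and tiles at the
    boundary are smaller when the shape is not a multiple of the tile size.
    """
    rows, cols = len(vec), len(vec[0])
    tiles = []
    for r in range(0, rows, sizes[0]):
        for c in range(0, cols, sizes[1]):
            # One tile corresponds to one unit-stride slice at offsets [r, c].
            tiles.append([row[c:c + sizes[1]] for row in vec[r:r + sizes[0]]])
    return tuple(tiles)


def insert_slices(tiles, sizes, shape):
    """Reassemble tiles produced by extract_slices into a rows x cols list."""
    rows, cols = shape
    out = [[None] * cols for _ in range(rows)]
    tile_iter = iter(tiles)
    for r in range(0, rows, sizes[0]):
        for c in range(0, cols, sizes[1]):
            tile = next(tile_iter)
            for i, tile_row in enumerate(tile):
                # Write each tile row back at its original offsets.
                out[r + i][c:c + len(tile_row)] = tile_row
    return out


# A 4x3 "vector" split into 2x2 tiles; the last column yields 2x1 tiles.
v = [[10 * r + c for c in range(3)] for r in range(4)]
tiles = extract_slices(v, [2, 2])
print([(len(t), len(t[0])) for t in tiles])  # [(2, 2), (2, 1), (2, 2), (2, 1)]
print(insert_slices(tiles, [2, 2], (4, 3)) == v)  # True
```

Modeling the tuple of slices as a plain Python tuple matches the intent stated earlier in the thread: the tuple only exists as an intermediate form and is rewritten away before reaching LLVM.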

What remains is a few tests to inspect generated assembly.

(Note: since we are also tracking this on Bugzilla, I suppose we can close this issue here; however, I don't have the permissions to do so.)

tensorflow / mlir, issue #327