Closed by newling 3 months ago
Adding a flag suggested by @MaheshRavishankar to avoid weight inlining works very well to reduce the number of dispatches. With the following compilation:
iree-compile --iree-flow-inline-constants-max-byte-length=0 \
--iree-hal-dump-executable-sources-to=./dispatches \
--iree-hal-target-backends=llvm-cpu \
esrgan_fp32_linalg.mlir \
-o cpu_exec.vmfb
we have just 20 dispatches. Seven of these don't contain any linalg operations; they're just flow.dispatch.load and flow.dispatch.store ops. Of the remaining 13 dispatches, 12 contain linalg.conv_2d_nchw_fchw, and the other one is just a linalg.generic with all "parallel" dimensions.
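For anyone wanting to reproduce this kind of tally, here is a rough Python sketch that buckets dumped dispatch sources by the linalg ops they mention. It assumes the dumped files have a `.mlir` extension and that op names appear textually as `linalg.<name>`; the helper names (`linalg_ops`, `summarize`, `load_dispatches`) are made up for illustration and are not part of IREE.

```python
import re
from collections import Counter
from pathlib import Path

# Matches textual op names such as "linalg.generic" or "linalg.conv_2d_nchw_fchw".
LINALG_OP = re.compile(r"\blinalg\.([A-Za-z_][A-Za-z0-9_]*)")
# Ops that only appear inside a linalg region body, so they are not counted
# as the dispatch's "main" op (an assumption made for this sketch).
IGNORED = {"yield", "index"}


def linalg_ops(text):
    """Return the sorted set of linalg op names found in one dispatch source."""
    return sorted({name for name in LINALG_OP.findall(text) if name not in IGNORED})


def summarize(sources):
    """Count dispatches per combination of linalg ops.

    `sources` maps a file name to the text of that dispatch.
    """
    counts = Counter()
    for text in sources.values():
        ops = linalg_ops(text)
        counts[", ".join(ops) if ops else "(no linalg ops)"] += 1
    return dict(counts)


def load_dispatches(directory):
    """Read every *.mlir file in `directory` (file extension is an assumption)."""
    return {p.name: p.read_text() for p in sorted(Path(directory).glob("*.mlir"))}
```

Something like `summarize(load_dispatches("./dispatches"))` would then give a dictionary of counts keyed by op combination. This is purely a text scan, so it will miss anything that a real MLIR parser would catch.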
I've stitched the dispatches back together in the following gist: https://gist.github.com/newling/6828571a28ac2a2c33cd35c355ac903b
I downloaded esrgan_fp32_linalg.mlir provided by @vivekkhandelwal1 in a Teams channel, and ran
This dumps the dispatches as individual mlir files in the directory "dispatches".
How many dispatches are there?
There are 360 distinct dispatches. This is more than expected, as there are not that many distinct layers in the model. But some of these dispatches are identical except for weights. For example when I compare two dispatches as follows:
I see that the only difference is in the weights that are embedded in the dispatch. Is there a way to turn the weights into function operands, so that the functions are unique?
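To check how many dispatches are structurally distinct once the embedded weights are ignored, a sketch along the following lines could help. It assumes the weights show up as `dense<...>` attributes in the dumped text and that dispatch symbols carry a `_dispatch_<number>` suffix; both are assumptions about the dump format, and the helper names are hypothetical rather than anything provided by IREE.

```python
import hashlib
import re
from collections import defaultdict
from pathlib import Path

# Assumption: inlined weights are printed as dense<...> attributes. The payload
# (list or hex string) is assumed not to contain a '>' character.
DENSE_ATTR = re.compile(r"dense<[^>]*>")
# Assumption: symbol names differ only by a numeric dispatch index.
DISPATCH_ID = re.compile(r"_dispatch_\d+")


def normalize(text):
    """Mask constant payloads and dispatch indices so only structure remains."""
    text = DENSE_ATTR.sub("dense<...>", text)
    return DISPATCH_ID.sub("_dispatch_N", text)


def group_by_structure(sources):
    """Group file names whose normalized text is identical.

    `sources` maps a file name to the text of that dispatch.
    """
    groups = defaultdict(list)
    for name, text in sources.items():
        key = hashlib.sha256(normalize(text).encode()).hexdigest()
        groups[key].append(name)
    return sorted(sorted(names) for names in groups.values())


def load_dispatches(directory):
    """Read every *.mlir file in `directory` (file extension is an assumption)."""
    return {p.name: p.read_text() for p in sorted(Path(directory).glob("*.mlir"))}
```

With the dump above, `len(group_by_structure(load_dispatches("./dispatches")))` would give the number of dispatches that are distinct up to weights. Textual masking is only a rough proxy for structural equality, so treat the result as an estimate.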