Open IanWood1 opened 2 days ago
ew, we should really not be seeing those asserts - what's adding those?
I think this is coming from torch-mlir
's lowering of some op (not entirely sure which). Don't these get dropped somewhere around flow/stream anyway?
Nope - they make it all the way to runtime if they are outside of dispatches and as you're seeing here will have bad influences during dispatch region formation/executable generation. Asserts should only be added explicitly by users unless a debug mode is enabled, IMO. Asserts inside of dispatches are no-ops today and get removed too late, so they just make compilation worse.
They could be used for int range analysis hints in a release build - but if that's the case we should probably absorb them into the int range ops at input time instead.
Asserts should only be added explicitly by users unless a debug mode is enabled, IMO. Asserts inside of dispatches are no-ops today and get removed too late, so they just make compilation worse.
That makes sense, I think they are there to conform with pytorch ops specs. We don't currently have a "debug/release mode" right?
@MaheshRavishankar do you have any suggestions on how to fix this?
--iree-opt-strip-assertions
can be used to strip them near input-time (I forget if it walks into linalg ops, but it should).
As a middle-stage compiler we want debug options for assertions that come in as user input to be controlled by the user creating the input - it's not possible to know if an assert was added by the user, a dialect conversion above us (like this), etc. If a dialect is inserting assertions it'd be nice if it had an option to stop inserting them.
For now though, --iree-opt-strip-assertions
unless you're testing correctness (and even then as seen here we'll never report assertions inside of dispatches today, though we could in debug modes if it ever proved useful - it's just really tricky logic per backend).
I thought that is on by default?
Doesn't seem like it. It could be. I suspect only a fraction of users care about the assertions and more would be confused by how badly they mess up performance.
@IanWood1 maybe start with adding this flag to all the CI tests, and a separate PR that turns it on by default.
I tried turning it on in https://github.com/iree-org/iree/pull/19014 but I didn't realize assertions don't get stripped until after hoisting, so there is no effect on dispatch count. Should this pass be moved? There is a comment explaining why it need to be after optimizations:
Good catch. That may not be true anymore now that we have information coming from the frontend and util.assume
- we could lower the assertions to those assume ops prior to removal as one of the first steps.
(oh and I'm pretty sure we don't derive information from the assertions today - so it'd be safe to move now!)
The most recent llvm integrate, https://github.com/iree-org/iree/pull/18987, introduced a minor regression in SDXL clip dispatch count (1139 ⇾ 1141). I tracked it to https://github.com/llvm/llvm-project/commit/df0d249b6511289f1e8c1389f4fd33d7b4c083fa. I was able to restore the dispatch count by locally reverting this single commit.
Here are the 2 additional dispatches after LLVM integrate:
Command used:
MLIR:
It appears these
linalg.generic
s were getting CSE'd before the change but can't anymore because of thecf.assert
which have side effects.