tensorflow / mlir-hlo

400 stars 71 forks source link

Missing CMake deps for THLO and suspect layering #45

Open stellaraccident opened 2 years ago

stellaraccident commented 2 years ago

Can someone please land an equivalent of this change?

https://github.com/iree-org/iree-mhlo-fork/commit/a6a3308969278cf9ec36132fc9235314fd091c23

I only did a little bit of poking to get a minimum size patch, but the dependency chain and code organization of these pieces seems wrong. Can it be improved? Many of us are not using mlir-hlo as part of XLA where maybe this is tolerated, and it is counter-intuitive that the code is laid out to have such a large transitive dependency set of compiler internals just to get the top level dialect.

stellaraccident commented 2 years ago

Additional fix needed to satisfy the BFD shared linker: https://github.com/iree-org/iree/commit/61e0e46e27663acc33405e056195371bf73dbcad

stellaraccident commented 2 years ago

Pushed a review upstream that will make this more strict and less likely to be caught by downstreams: https://reviews.llvm.org/D131911

stellaraccident commented 2 years ago

Using my super-powers to look at the internal CI, I see that it is:

  1. Not building the Python bindings at all.
  2. Using LLD vs BFD (shouldn't matter with the above upstream patch but BFD tends to be a bit more strict in the absence of further guidance).
  3. Doing a static library build (vs shared, which tends to surface more issues).

I would address 1 by enabling the Python bindings. And for 3, I would change to BUILD_SHARED_LIBS=ON and see what breaks.

I wouldn't do anything about 2 (the above patch to LLVM should make them all strict).

stellaraccident commented 2 years ago

Upstream patch landed. If you now enable the Python bindings in your CI at that patch or later, I expect that you will get a failure equivalent to what we experienced downstream.

stellaraccident commented 2 years ago

(and this will also make sure it works on Windows)

stellaraccident commented 2 years ago

(separately, we would also like the project layered better so that it is possible to depend on MHLO and specific conversions/passes vs the transitive "everything")

burmako commented 2 years ago

Stella, thank you very much for filing the bug report and for digging into the details! We'll get this handled.

sherhut commented 2 years ago

Thanks for the report and patch. As you noted, we did not see this issue as it does not show up on our CI.

We are not using the cmake build files ourselves and did not notice the layering violation. I have split out the thlo lowerings into their own target in commit 21a5d7d7dd722763e03dc447f73276a4b83f53be. I hope this fixed your immediate layering concerns.

As to the dependency of thlo on gml_st: We will move interfaces into their own targets eventually. We have forked the upstream TlingInterface to prototype a different version before bringing it up in an RFC, so the current layering in cmake is temporary.