Closed nsmithtt closed 2 weeks ago
@pjanevskiTT, with this upgrade we now are tracking metal main
This looks great, thanks Nick. Few things I wanted to ask/mention
set(ENABLE_LIBCXX OFF)
inside CMakeLists.txt in tt-metal repo. using type = std::nullptr_t;
It looks like metal builds with this change as well, so maybe we can change it in order to support building it inside tt-mlir as well. If it makes sense, I can make the change on tt-metal side
Otherwise, this PR looks great. I think development and integration is going to be very smooth this way, thanks once again
This looks great, thanks Nick. Few things I wanted to ask/mention
- When I want to build standalone metal inside tt-mlir, I need to set ENABLE_LIBCXX to OFF in order to use libstd. Is there some smart way to do this? Only way I found is basically doing
set(ENABLE_LIBCXX OFF)
inside CMakeLists.txt in tt-metal repo.
When you invoke cmake from the cmake command line you can do -DENABLE_LIBCXX=OFF
.
- Even when I force ENABLE_LIBCXX to off, build fails. Required change to make the build work is changing this line to
using type = std::nullptr_t;
Hmm, this feels like a compiler version issue. Our CI is able to build with it disabled, I think we are using clang-17 though. If there is a simple fix on the metal side to support your compiler version I'd go ahead and create a PR against metal with commit message #0: Build error supports clang-....
.
It looks like metal builds with this change as well, so maybe we can change it in order to support building it inside tt-mlir as well. If it makes sense, I can make the change on tt-metal side
Yes so metal builds with this change, left default enabled. But if you look at this PR, we set -DENABLE_LIBCXX=OFF
.
@nsmithtt can you confirm if everything works on silicon? I've been trying something out on forge side (on this branch of mlir), and am hitting some issues when creating sys desc (n300 machine)... concretely, when deviceMesh.close_devices()
is called in getCurrentSystemDesc()
, everything hangs...
@nsmithtt can you confirm if everything works on silicon? I've been trying something out on forge side (on this branch of mlir), and am hitting some issues when creating sys desc (n300 machine)... concretely, when
deviceMesh.close_devices()
is called ingetCurrentSystemDesc()
, everything hangs...
Hmm, so mine isn't hanging, but the simple_sum test does error out with OOM. I'm going to hold off on merging this until we have silicon CI in place which I think is close to landing. I don't want to destabilize.
FYI all, going to try and uplift now that we have silicon CI.
Regressions: