Closed: christianjgreen closed this 2 months ago
system_env in deps is only used during compilation. You want to set this flag when you call mix, or export it in your terminal directly. :)
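For example, one way to set it beforehand (assuming a POSIX shell; the dump path is only illustrative) is to set the variable just for the command that starts the VM:

# set the flag for this invocation only, then start the VM
XLA_FLAGS="--xla_dump_to=/tmp/hlo" iex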
This is the method i'm currently using, does it need to be set beforehand?
Mix.install(
[
{:exla, path: "~/projects/nx/exla"},
{:nx, path: "~/projects/nx/nx"}
],
# system_env: %{"XLA_TARGET" => "cuda12"}, <-- this works
system_env: %{"XLA_FLAGS" => "--xla_dump_to=/tmp/hlo"}, # <- this does not work
)
Oh, on Mix.install, that should set the env var correctly. If it doesn't work, then maybe the flag itself is no longer relevant? Or maybe it also needs --xla_dump_hlo_as_dot? Are you sure the EXLA compiler is used? Are you using EXLA.jit to run the code?
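For what it's worth, two quick sanity checks you could run inside the session (these are standard Elixir calls, using the dump path from your snippet):

# confirm the variable is actually visible to the runtime after Mix.install
System.get_env("XLA_FLAGS")
# and check whether XLA wrote anything to the dump directory
File.ls("/tmp/hlo")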
Positive the compiler is being used! Let me give you the latest snippet instead of a bad copy paste :p
Mix.install(
[
{:exla, path: "~/projects/nx/exla"},
{:nx, path: "~/projects/nx/nx"}
],
# system_env: %{"XLA_TARGET" => "cuda12"}, <-- this works
system_env: %{"XLA_FLAGS" => "--xla_dump_to=/tmp/hlo --xla_dump_hlo_as_dot"}, # <- this does not work
config: [nx: [default_backend: EXLA.Backend]]
)
And these are the three different calls I've tried to get HLO from
{matrix, _} = Nx.Random.uniform(Nx.Random.key(42223), shape: {20, 20}, type: :f32)
matrix = Nx.add(matrix, Nx.transpose(matrix)) |> Nx.divide(2)
Nx.LinAlg.eigh(matrix)
Nx.Shared.optional(
:j_eigh4,
[matrix],
{Nx.take_diagonal(matrix), matrix},
&Nx.LinAlg.eigh/1
)
EXLA.jit(&Nx.LinAlg.JacobiEigh.eigh/1)
Just to make sure, are you calling the function returned by EXLA.jit(&Nx.LinAlg.JacobiEigh.eigh/1)?
Yes sir!
Here is the call and the returned tuple
e = EXLA.jit(&Nx.LinAlg.JacobiEigh.eigh/1)
e.(matrix)
{#Nx.Tensor<
f32[20]
EXLA.Backend<host:0, 0.3304141925.3251240981.37154>
[-1.6381525993347168, -1.3552285432815552, -1.1910206079483032, -1.0310735702514648, -0.912943959236145, -0.8215287923812866, -0.6370212435722351, -0.3248468041419983, -0.23171192407608032, -0.08701343089342117, 0.19126664102077484, 0.3433498442173004, 0.3924603760242462, 0.5175570249557495, 0.7870075106620789, 0.8510072827339172, 1.1144767999649048, 1.2940683364868164, 1.6437021493911743, 9.338674545288086]
>,
#Nx.Tensor<
f32[20][20]
EXLA.Backend<host:0, 0.3304141925.3251240981.37155>
[
[-0.07324251532554626, -0.3707551062107086, -0.3129488527774811, -0.04760716110467911, -0.14367316663265228, -0.05768333375453949, 0.26982104778289795, -0.08689552545547485, 0.01865920051932335, 0.05672043189406395, -0.11009827256202698, 0.23080967366695404, 0.010919198393821716, 0.00763005530461669, -0.04259205609560013, 0.6290202736854553, 0.2388024479150772, 0.19880637526512146, -0.1793692409992218, 0.23938940465450287],
[0.10503554344177246, 0.48980197310447693, -0.07325314730405807, 0.36990198493003845, -0.10179188847541809, 0.11219511926174164, -0.13094516098499298, 0.05496946722269058, 0.4019777774810791, 0.1967068314552307, 0.15009185671806335, -0.014761583879590034, -0.0678885355591774, -0.22851940989494324, 0.007818772457540035, 0.012614989653229713, 0.4396161437034607, 0.16732947528362274, -0.11722132563591003, 0.22061890363693237],
[0.3245624005794525, -0.16596823930740356, -0.37555015087127686, 0.10839072614908218, 0.05614854767918587, -0.16937614977359772, 0.1441095620393753, 0.1379358321428299, ...],
...
]
>}
So I have no other ideas, sorry :)
No problem, and thanks again for all the help! I'm going to keep working on this and https://github.com/elixir-nx/nx/issues/1027#issuecomment-2143049605
Have a great day!
💚 💙 💜 💛 ❤️
@christianjgreen I think you still want to set the env var beforehand: export it in your terminal session where you start iex/livebook (or in ~/.livebookdesktop.sh, in case of Livebook Desktop).
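For example (mirroring the flags from the snippet above), the export could live in the shell session or in ~/.livebookdesktop.sh:

# make the flag visible to the runtime before iex or Livebook starts
export XLA_FLAGS="--xla_dump_to=/tmp/hlo --xla_dump_hlo_as_dot"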
Closing this, but let us know if there is still something we can do to improve it!
I am trying to analyze the optimized HLO output for a few Nx functions using this flag in a Livebook: system_env: %{"XLA_FLAGS" => "--xla_dump_to=/tmp/hlo"}. However, no dumps are generated. If there's another preferred way to get the optimized dumps for a function, I'd love to know how the core devs originally checked the output graphs.
Thanks for any help/tips!