nv-legate / legate.core

The Foundation for All Legate Libraries
https://docs.nvidia.com/legate/24.06/
Apache License 2.0

[BUG] conda package missing legion_prof #950

Open syamajala opened 2 months ago

syamajala commented 2 months ago

Software versions

The legate 24.06.00 conda package is missing legion_prof

Jupyter notebook / Jupyter Lab version

No response

Expected behavior

The conda packages should include legion_prof

Observed behavior

legion_prof is missing

Example code or instructions

Install legate 24.06.00 from conda.

Stack traceback or browser console output

No response

marcinz commented 1 month ago

To get the profiler, we would need to configure with --legion-rust-profiler. This would increase the footprint of the package by requiring rust. @m3vaz, I am thinking that we should add this as a separate package output to our recipe.

syamajala commented 1 month ago

Do you know what legion commit was used to build legate 24.06.00?

I can just install the profiler myself for now, but when I tried the latest Legion master, the profiler crashed while generating a profile from a log produced by legate 24.06.00.

marcinz commented 1 month ago

@syamajala v24.06.00 was built off 75074815f2cc063bd38f78901a7538a06012fe43.

@m3vaz, I guess for the time being we could add the profiler to the package and worry about splitting later. We would need to add a rust dependency, but that should probably be enough.

syamajala commented 1 month ago

Thanks! I'm able to view the profiles after checking out that commit of Legion.

manopapad commented 1 month ago

Note that adding the Rust profiler unconditionally to all CI builds will result in an uncacheable Rust build, at least when using ccache, which doesn't support Rust (sccache appears to).

Also I have suggested that the build of the rust profiler happen in a separate step, rather than as part of the main build, otherwise we hit https://github.com/nv-legate/legate.core/issues/860. See internal work item LLRDO-176.
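
For the caching concern above, a minimal sketch of routing the Rust build through a compiler cache. RUSTC_WRAPPER is the standard Cargo environment variable for wrapping rustc; the helper name enable_rust_cache is hypothetical, and whether sccache is available on the CI images is an assumption.

```shell
# Hypothetical helper: route rustc through a compiler cache if one is installed.
# Defaults to sccache, which (unlike ccache) is reported to support Rust.
enable_rust_cache() {
  wrapper="${1:-sccache}"
  if command -v "$wrapper" >/dev/null 2>&1; then
    # Cargo invokes "$RUSTC_WRAPPER rustc ..." for every compilation unit.
    export RUSTC_WRAPPER="$wrapper"
  else
    echo "$wrapper not found; the Rust build will not be cached" >&2
    return 1
  fi
}
```

Calling enable_rust_cache before a separate profiler build step would keep the main build unaffected, and the step simply runs uncached when the wrapper is missing.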

syamajala commented 1 month ago

I am only able to use legion_prof --view; trying to generate an archive crashes.

syamajala commented 1 month ago

stack trace for legion_prof --archive crash:

Reading log file "legate_0.prof"...
Matched 98916 objects
No Legion Spy data, skipping postprocess step
Sorting time ranges
Created output directory "legion_prof.2"
Writing level 0 with 1 tiles
thread '<unnamed>' panicked at src/backend/data_source.rs:1137:67:
called `Option::unwrap()` on a `None` value
stack backtrace:
thread '<unnamed>' panicked at src/backend/data_source.rs:1137:67:
called `Option::unwrap()` on a `None` value
   0: rust_begin_unwind
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
   2: core::panicking::panic
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:144:5
   3: <legion_prof::backend::data_source::StateDataSource as legion_prof_viewer::data::DataSource>::fetch_slot_meta_tile
   4: std::panicking::try
   5: rayon_core::registry::Registry::catch_unwind
   6: <rayon_core::job::HeapJob<BODY> as rayon_core::job::Job>::execute
   7: rayon_core::registry::WorkerThread::wait_until_cold
   8: rayon_core::registry::ThreadBuilder::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
stack backtrace:
Rayon: detected unexpected panic; aborting
[1]    607787 IOT instruction (core dumped)  RUST_BACKTRACE=1 legion_prof --archive legate_0.prof

syamajala commented 1 week ago

24.06.01 does not seem to have the profiler either?

syamajala commented 1 week ago

What commit of Legion was 24.06.01 built with? I tried the latest Legion master, but the profiler crashes when I try to view cuNumeric profile logs.

manopapad commented 1 week ago

It used Legion commit a66da82b8fb1fb45d3605963cb33bced51da2f6e
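
Until the conda package ships legion_prof, the workaround discussed in this thread can be sketched as below: look up the Legion commit for the installed legate release and build the profiler from that commit. Both commit hashes come from the comments above; only the 24.06.00 one was confirmed in this thread to produce a working viewer. The function names, the upstream repository URL, the tools/legion_prof_rs path, and the use of --all-features are assumptions, not something this thread states.

```shell
# Map a legate release to the Legion commit reported in this thread.
legion_commit_for() {
  case "$1" in
    24.06.00) echo 75074815f2cc063bd38f78901a7538a06012fe43 ;;
    24.06.01) echo a66da82b8fb1fb45d3605963cb33bced51da2f6e ;;
    *) echo "no known Legion commit for legate $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build legion_prof from the matching Legion commit.
# Requires git and a Rust toolchain (cargo) on PATH.
install_matching_legion_prof() {
  commit=$(legion_commit_for "$1") || return 1
  # Assumed upstream repository location and profiler crate path.
  git clone https://github.com/StanfordLegion/legion.git legion-src &&
    git -C legion-src checkout "$commit" &&
    cargo install --all-features --path legion-src/tools/legion_prof_rs
}
```

For example, install_matching_legion_prof 24.06.00 would clone Legion into ./legion-src and install legion_prof into Cargo's bin directory. Matching the commit matters because a profiler built from a newer Legion master crashed on logs written by the older runtime.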