JuliaParallel / Dagger.jl

A framework for out-of-core and parallel execution

Assignable task names for logging purposes #545

Closed (m-fila closed this 3 weeks ago)

m-fila commented 2 months ago

Currently, task names are inferred through reflection from the names of the scheduled functions or callables. In the code I'm experimenting with, most of the tasks are closures or mockups, so they either end up with names like #242 or they all share the same name. I think that for visualization and debugging it would be very useful to be able to programmatically assign a human-readable name.

I'd appreciate it if something like this were possible:

a = Dagger.@spawn name="taskA" mockup_job(x,y,z)
b = Dagger.@spawn name="taskB" mockup_job(u,v,w)
c = Dagger.@spawn name="taskC" mockup_job(a,b)

so that in the logs and visualization the tasks could appear as taskA, taskB, taskC instead of mockup_job, mockup_job, mockup_job.

jpsamaroo commented 1 month ago

There is already a bit of logic that does this for data (though not yet for tasks): Dagger.logs_annotate!(data, name), which the :graphviz renderer uses to give names to data nodes in the generated GraphViz graph. I'd be open to adding support for this syntax to @spawn. The logic could probably be added after https://github.com/JuliaParallel/Dagger.jl/blob/5b5f81651e716120d14b1395c3eef0c9840adebe/src/submission.jl#L227 (just before the DTask is returned). We'd probably want to add a separate logs_annotate! method for DTask so that it can also write t.uid into the logs as the key (maybe we can put it under a :task_annotation category).
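
As a rough illustration of the idea above (recording a task name in the logs, keyed by the task's uid, under a :task_annotation category), here is a minimal self-contained sketch. It does not use Dagger at all: MockTask, annotate_task!, and the Dict-based log store are hypothetical stand-ins, and the real logs_annotate! method for DTask may look quite different.

```julia
# Hypothetical stand-in for a DTask; only the uid field matters here.
struct MockTask
    uid::UInt
end

# Assumed log layout for this sketch: category => vector of events.
# Record a human-readable name for task `t`, keyed by its uid,
# under the :task_annotation category.
function annotate_task!(logs::Dict{Symbol,Vector{Any}}, t::MockTask,
                        name::AbstractString)
    events = get!(() -> Any[], logs, :task_annotation)
    push!(events, (uid = t.uid, name = String(name)))
    return t  # return the task so the call can wrap a spawn expression
end

# Usage: annotate two mock tasks that would otherwise share a name.
logs = Dict{Symbol,Vector{Any}}()
a = annotate_task!(logs, MockTask(1), "taskA")
b = annotate_task!(logs, MockTask(2), "taskB")
```

Keying by uid rather than by function name is what lets several tasks that run the same function (mockup_job in the example above) carry distinct labels.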

We'll potentially also want another log event to be generated around https://github.com/JuliaParallel/Dagger.jl/blob/5b5f81651e716120d14b1395c3eef0c9840adebe/src/submission.jl#L132 that maps from the DTask uid to the Thunk tid, so that renderers can then use that to do the lookup from uid -> tid when filling out task names (as those two IDs are currently different per task). This event can eventually be removed once we unify uids and tids, which I have a local branch for doing.
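
To make the uid -> tid lookup concrete, here is a small sketch of how a renderer might combine the two kinds of events. The event shapes (named tuples with uid/name and uid/tid fields) and the function names task_names_by_tid and task_label are assumptions made for illustration only, not Dagger's actual log format or API.

```julia
# Build a tid => name table from two hypothetical event streams:
#   annotations: events with fields (uid, name)
#   id_map:      events with fields (uid, tid)
function task_names_by_tid(annotations, id_map)
    uid_to_tid = Dict(ev.uid => ev.tid for ev in id_map)
    names = Dict{Int,String}()
    for ann in annotations
        tid = get(uid_to_tid, ann.uid, nothing)
        tid === nothing && continue  # no mapping event seen for this uid
        names[tid] = ann.name
    end
    return names
end

# Use the assigned name if there is one, else fall back to the
# inferred function name.
task_label(names, tid, fallback) = get(names, tid, fallback)

# Usage with made-up IDs:
annotations = [(uid = UInt(1), name = "taskA"), (uid = UInt(2), name = "taskB")]
id_map = [(uid = UInt(1), tid = 10), (uid = UInt(2), tid = 11)]
names = task_names_by_tid(annotations, id_map)
task_label(names, 10, "mockup_job")
```

Once uids and tids are unified, the intermediate uid_to_tid table would no longer be needed and annotations could be indexed directly.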

jpsamaroo commented 1 month ago

I'm going to self-assign this, since I had to build out most of this infrastructure for another PR.