JuliaParallel / Dagger.jl

A framework for out-of-core and parallel execution

add documentation for Task / Thread occupancy #549

Closed. wolthom closed this 2 months ago.

wolthom commented 3 months ago

This PR adds a section to the Task Spawning documentation page to explain the behavior observed in #548.

wolthom commented 3 months ago

@jpsamaroo I think I may have an idea:

Did you run the make.jl script from the command line, e.g. via julia --threads=12 --project=. make.jl?

For some reason that hangs for me as well. However, if I start a REPL, activate the docs environment and include("make.jl"), it builds normally for me.
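
For reference, the two ways of running the build compared above can be written as a minimal shell sketch. The function names and the JULIA override variable are hypothetical. The second function is only a non-interactive approximation of the REPL steps, and this thread does not establish whether it behaves like an interactive session.

```shell
#!/bin/sh
# Assumption: run from the directory that contains make.jl.
# JULIA is a hypothetical override for the julia binary; it defaults to "julia".

# Variant 1: run make.jl directly as a script (the invocation reported to hang).
build_docs_cli() {
    "${JULIA:-julia}" --threads=12 --project=. make.jl
}

# Variant 2: start Julia with the same project and include the script,
# approximating the manual REPL steps without an interactive session.
build_docs_include() {
    "${JULIA:-julia}" --threads=12 --project=. -e 'include("make.jl")'
}
```

Calling build_docs_cli or build_docs_include from the docs directory runs the respective variant, which makes it easier to compare them with the same thread count and project.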

That also explains why the Documentation CI step timed out (I just realized that).

Could this potentially be an interaction between the Dagger scheduler / the main thread of the REPL / ...?

jpsamaroo commented 3 months ago

Yeah, I ran it from the CLI as well. I'm running with the JULIA_DEBUG=Dagger env var set, but nothing prints, which makes me think that Documenter is filtering it out. I forced it in Dagger directly, yet it still doesn't show anything, so it's probably being redirected into a file/pipe for output. This makes things particularly hard to debug... maybe I'll need to implement a debug helper that sends output to a file or something?
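
One low-effort way to check whether any debug output reaches the process at all is to redirect stderr to a file from the shell. This is only a sketch. The function name, the default log file name, and the JULIA override are hypothetical, and it assumes the debug messages are written to stderr. Anything that Documenter captures internally would not show up in this file, so it is not a substitute for a debug helper inside Dagger.

```shell
#!/bin/sh
# Assumption: run from the directory that contains make.jl.
# JULIA is a hypothetical override for the julia binary; it defaults to "julia".

# Run the docs build with Dagger debug logging enabled and send stderr
# to a log file (first argument, defaulting to dagger-debug.log).
run_with_debug_log() {
    logfile="${1:-dagger-debug.log}"
    JULIA_DEBUG=Dagger "${JULIA:-julia}" --threads=12 --project=. make.jl 2> "$logfile"
}
```

After a run, an empty log file would suggest the messages are being filtered or captured before they reach stderr.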

wolthom commented 3 months ago

Interestingly enough, I do get output when running from the CLI with JULIA_DEBUG=Dagger: [image]

With the debug messages enabled, it no longer hangs, though. Instead, it fails with an exception in the example.

If this is specific to this environment / setup and not an actual underlying bug, I could also convert the executed code into "dumb" code snippets as a workaround.

jpsamaroo commented 3 months ago

@wolthom yeah, let's do that for now. I don't have much time to investigate this right now, and I don't want to block adding this information to the docs, so let's just use "dumb" code blocks. Maybe you can include the result from running on your local system?

wolthom commented 3 months ago

@jpsamaroo I have updated the task-spawning.md note as discussed above. Now the code blocks should no longer be executed as part of the documentation build (and thus not hang).

Could you verify if this also works as expected for you?

jpsamaroo commented 2 months ago

Worked great, thank you!