JuliaLang / Distributed.jl

Create and control multiple Julia processes remotely for distributed computing. Ships as a Julia stdlib.
https://docs.julialang.org/en/v1/stdlib/Distributed/
MIT License
20 stars 8 forks source link

Moar threadsafe moar better #101

Open JamesWrigley opened 1 month ago

JamesWrigley commented 1 month ago

This is a rebased version of #4, it should be ready to merge. Fixes #73.

(made after discussing with @jpsamaroo)

CC @vchuravy, @vtjnash

JamesWrigley commented 1 month ago

I tracked the test failures down to: https://github.com/JuliaLang/julia/pull/53326 The workers are loading the builtin version of Distributed from the sysimg while the master is using the development version, and because we changed the definition of the Worker struct in this PR we get a serialization error when doing a remotecall to another worker because the master and worker are using different types.

I think this should be fixed in Base, made a PR here: https://github.com/JuliaLang/julia/pull/54571 That should be merged before this one.

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 94.11765% with 2 lines in your changes missing coverage. Please review.

Project coverage is 78.87%. Comparing base (6a07d98) to head (ec8bce0). Report is 4 commits behind head on master.

:exclamation: Current head ec8bce0 differs from pull request most recent head fa9d645

Please upload reports for the commit fa9d645 to get more accurate results.

Files Patch % Lines
src/cluster.jl 96.96% 1 Missing :warning:
src/managers.jl 0.00% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #101 +/- ## ========================================== + Coverage 71.69% 78.87% +7.18% ========================================== Files 10 10 Lines 1950 1908 -42 ========================================== + Hits 1398 1505 +107 + Misses 552 403 -149 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

JamesWrigley commented 1 month ago

Alrighty, after a force push to trigger CI with latest nightly we are back in the green :partying_face: I think this is ready to be merged now.

JamesWrigley commented 1 month ago

Will this be compatible to or backported, if necessary, to the next Julia LTS?

I'll leave that for someone more qualified to properly answer, but FWIW if 1.11 is chosen as the next LTS then it'll be possible to upgrade Distributed now that it's an excised stdlib :octopus:

JamesWrigley commented 1 month ago

One other thing I noticed is that this should probably be using Threads.@spawn everywhere instead of @async, but I'll leave that for another PR.

JamesWrigley commented 2 weeks ago

I believe @jblaschke was going to have a look at it. In the meantime I see the branch is out of date, so I'll rebase it.

JBlaschke commented 2 weeks ago

@JamesWrigley @jonas-schulze I'll be working on this this week. Just getting back up to speed after long travel...