JuliaLang / Distributed.jl

Create and control multiple Julia processes remotely for distributed computing. Ships as a Julia stdlib.
https://docs.julialang.org/en/v1/stdlib/Distributed/
MIT License

Distributed

The Distributed package provides functionality for creating and controlling multiple Julia processes remotely, and for performing distributed and parallel computing. It uses network sockets or other supported interfaces to communicate between Julia processes, and relies on Julia's Serialization stdlib package to transform Julia objects into a format that can be transferred between processes efficiently. It provides utilities to create and destroy Julia processes and add them to a "cluster" (a collection of Julia processes connected together), as well as functions to perform Remote Procedure Calls (RPC) between the processes within a cluster. See API for details.

This package ships as part of the Julia stdlib.

Using development versions of this package

To use a newer version of this package, you need to build Julia from scratch. The build process is the same as any other build except that you need to change the commit used in stdlib/Distributed.version.

It's also possible to load a development version of the package using the trick described in the section "Using the development version of Pkg.jl" in the Pkg.jl repo. This approach is limited, however: all other packages still depend on the stdlib version of the package and will not work with the modified one.

API

The public API of Distributed consists of functions and macros grouped by task. For creating and destroying processes within a cluster:
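As a minimal sketch of process creation and removal, the following uses addprocs and rmprocs (both named later in this README) together with workers and nprocs from the same stdlib. It assumes local worker processes can be launched on the current machine. The variable names are illustrative only.

```julia
using Distributed

# Number of processes before adding any workers (1 in a fresh session).
before = nprocs()

# Start two worker processes on the local machine; addprocs returns their IDs.
new_ids = addprocs(2)

# workers() lists the IDs of all worker processes in the cluster.
worker_ids = workers()
during = nprocs()

# Remove the workers again. With the default `waitfor`, rmprocs blocks
# until the requested processes have been removed.
rmprocs(new_ids)
after = nprocs()
```

Run from process 1, this should leave the cluster with the same number of processes it started with.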

For controlling other processes via RPC:
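A short, hedged sketch of RPC-style calls follows. It uses remotecall, remotecall_fetch, and the @spawnat macro from Distributed, and assumes a single local worker can be started. The worker and the arithmetic are chosen purely for illustration.

```julia
using Distributed

# One local worker to call into (illustrative setup).
pid = first(addprocs(1))

# remotecall returns immediately with a Future; fetch blocks until the
# result is available and returns it.
fut = remotecall(+, pid, 1, 2)
sum_result = fetch(fut)

# remotecall_fetch performs the call and fetches the result in one step.
remote_id = remotecall_fetch(myid, pid)

# @spawnat evaluates an expression on the given process and returns a Future.
squared = fetch(@spawnat pid 7^2)

rmprocs(pid)
```

remotecall suits cases where the result is needed later, while remotecall_fetch suits cases where the caller needs the value right away.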

For communicating between processes in the style of a channel or stream:
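One way to sketch channel-style communication is with RemoteChannel, which wraps a Channel living on one process so that other processes can put! to it and take! from it. The example assumes one local worker. The buffer size and the values sent are arbitrary.

```julia
using Distributed

pid = first(addprocs(1))

# Create a buffered Channel{Int} on the current process and wrap it in a
# RemoteChannel handle that can be passed to other processes.
results = RemoteChannel(() -> Channel{Int}(10))

# Run a function on the worker that puts three values into the channel,
# and wait for that remote call to finish.
remotecall_wait(pid, results) do ch
    for i in 1:3
        put!(ch, i * 10)
    end
end

# Take the values back out on the calling process, in insertion order.
received = [take!(results) for _ in 1:3]

rmprocs(pid)
```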

For controlling multiple processes at once:
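The sketch below shows one possible way to drive several processes at once, using @everywhere to define a function on all processes and pmap to spread calls across the workers. The function name slow_square is hypothetical, and two local workers are assumed.

```julia
using Distributed

pids = addprocs(2)

# Define the (hypothetical) function on every process so that the workers
# can run it. The sleep stands in for real work.
@everywhere slow_square(x) = (sleep(0.1); x^2)

# pmap hands the inputs to the available workers and returns the results
# in the same order as the inputs.
squares = pmap(slow_square, 1:6)

rmprocs(pids)
```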

Process Identifiers

Julia processes connected with Distributed are all assigned a cluster-unique Int identifier, starting from 1. The first Julia process within a cluster is given ID 1, while other processes added via addprocs get incrementing IDs (2, 3, etc.). Functions and macros which communicate from one process to another usually take one or more identifiers to determine which process they target - for example, remotecall_fetch(myid, 2) calls myid() on process 2.
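To illustrate the ID scheme, this small sketch asks each newly added worker to report its own ID with remotecall_fetch(myid, p), the same pattern as the example above. It assumes the script runs in the process that starts the cluster, with two local workers.

```julia
using Distributed

# ID of the current process: 1 in the process that starts the cluster.
head_id = myid()

# Each added worker receives a new ID greater than 1.
pids = addprocs(2)

# Ask every worker to report the ID it sees for itself.
reported = [remotecall_fetch(myid, p) for p in pids]

rmprocs(pids)
```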

!!! note
    Only process 1 (often called the "head", "primary", or "master") may add or remove processes, and manages the rest of the cluster. Other processes (called "workers" or "worker processes") may still call functions on each other and send and receive data, but addprocs/rmprocs on worker processes will fail with an error.
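As a rough sketch of the restriction described in the note, the following asks a worker to call addprocs itself and checks that the call fails. It assumes that the error raised on the worker reaches the caller wrapped in a RemoteException. The exact error message is not assumed.

```julia
using Distributed

pid = first(addprocs(1))

# Ask the worker to run addprocs(1). Per the note above this should fail;
# the failure is assumed to surface on process 1 as a RemoteException.
worker_addprocs_failed = try
    remotecall_fetch(addprocs, pid, 1)
    false
catch err
    err isa RemoteException
end

# The worker itself remains usable for ordinary remote calls.
still_alive = remotecall_fetch(myid, pid) == pid

rmprocs(pid)
```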