Open amitmurthy opened 9 years ago
Can you help me understand the motivation here - how does a 'message' differ from just sending a normal Julia type via RemoteRef?
This is more about clean separation of layers at a lower level in the stack. Nothing changes at the user level, though we may choose to expose the message layer to the user too. My visualization of the stack looks somewhat like this:
---------------------------------------------------
| Topology aware smart iterators, workflow DAGs | - 5
---------------------------------------------------
| Higher level API - @spawn, @parallel, pmap etc | - 4
---------------------------------------------------
| Remote function call API | Data transfer API | - 3
---------------------------------------------------
| Message API | - 2
----------------------------------------------------
| Byte streams (TCP/IP) | MPI | 0MQ | Others | - 1
---------------------------------------------------
Starting from the top:
5 is what folks in @alanedelman 's research group that I spoken with have stated as the goal at one time or another.
4 is what folks typically use to get parallelism in their code.
In 3, the API we have is one of remote function execution via the remotecall*
functions, and data transfer via put!
, take
, etc., which all internally use remotecall
today. I think there is scope to optimize the data transfer APIs to set/get data directly, bypassing the need for closure serialization / function calling as currently done.
Currently 1 & 2 are sort of combined and not user accessible. We should expose an an API along the lines of
send(pid::Int, msg_type::Symbol, msg::Array{UInt8, 1})
to just transport the block of bytes, msg
to worker pid
where it handled depending on msg_type
.
Handlers could even be user defined. Exposing something at this level just provides users with greater flexibility if desired - most folks don't need this, but for those who do, it is possible. Also, becomes easier to support multiple transports for the transfer of msg
- we don't care how it is transferred to worker pid
- could be via MPI or 0MQ or the TCP transport in Base.
+:100: – I really like this vision. I'm also not certain that our current RemoteRef model is right for 3.
This is perfect. I am using this as the Julia parallel roadmap when anyone asks.
Not really relevant to Base anymore, since this would be Distributed.jl related, or other parallelism/messaging packages
Starting a discussion to cleanly specify a Julia message. This will help in
A Julia message header could include