IhorTsoi opened this issue 3 years ago
@IhorTsoi how many cores does your computer have?

There's a startup and coordination cost to parallelisation, which is why most programming languages don't bother with it. In the case you've shared you might not see the benefit because the calculation is very fast. This is more of a software engineering question than an Elixir-specific one.
Recommend reading: http://www.gotw.ca/publications/concurrency-ddj.htm
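As a quick way to check from inside Elixir (just a convenience snippet, not from the tutorial), the BEAM can report how many scheduler threads it is running, which by default is one per logical core:

```elixir
# number of scheduler threads the BEAM runs (by default, one per logical core)
IO.inspect(System.schedulers_online(), label: "schedulers online")

# logical processors detected by the VM (may return :unknown on some systems)
IO.inspect(:erlang.system_info(:logical_processors_available), label: "logical processors")
```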
Hi @nelsonic!
Thank you very much for your response 😃
I have 2 physical + 2 virtual cores.
I should have mentioned another fact in my initial post: the sequential implementation from the example ran 3 times slower than the concurrent one.
I'll paste the code from the example here for your convenience:
```elixir
defmodule Factorial do
  def spawn(n) do
    1..n
    |> Enum.chunk_every(4)
    |> Enum.map(fn(list) ->
      # spawn a worker for this chunk, then send it our PID...
      spawn(Factorial, :_spawn_function, [list])
      |> send(self())

      # ...and block here until the worker replies with its partial product
      receive do
        n -> n
      end
    end)
    |> calc_product()
  end

  def _spawn_function(list) do
    receive do
      sender ->
        product = calc_product(list)
        send(sender, product)
    end
  end

  # used on the single process
  def calc_product(n) when is_integer(n) do
    Enum.reduce(1..n, 1, &(&1 * &2))
  end

  # used with multiple processes
  def calc_product(list) do
    Enum.reduce(list, 1, &(&1 * &2))
  end

  # just a helper function used to time the other functions
  def run(f_name, args) do
    :timer.tc(Factorial, f_name, args)
    |> elem(0) # only display the time, as I didn't want to log numbers that could have thousands of digits
    |> IO.inspect(label: "----->")
  end
end
```
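This is how I timed both versions, assuming the module above is compiled (the argument is arbitrary; times are in microseconds):

```elixir
# sequential: one process folds over the whole range
Factorial.run(:calc_product, [100_000])

# "concurrent": one spawned process per chunk of 4 numbers
Factorial.run(:spawn, [100_000])
```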
So:

- The sequential implementation processes the numbers with the `reduce` function. One-by-one.
- The concurrent implementation doesn't look parallel either (on each iteration of `Enum.map` it starts a process and waits for it to finish, then the next iteration, and so on...). So it looks like the calculations are still done sequentially (please correct me if I'm wrong here).
- From my point of view, even if each chunk could be processed before we hit the `receive` macro, so that we wouldn't wait at any point of the execution, we would still spend some additional time spawning processes and sending/receiving the messages (see the sketch below).
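Just to illustrate the overhead I mean, here is a rough way to measure a single spawn plus one message round-trip with `:timer.tc` (my own snippet, not from the tutorial; the numbers will obviously vary):

```elixir
{micros, :pong} =
  :timer.tc(fn ->
    parent = self()

    # one throwaway process that just echoes a pong back
    pid = spawn(fn ->
      receive do
        {:ping, from} -> send(from, :pong)
      end
    end)

    send(pid, {:ping, parent})

    receive do
      :pong -> :pong
    end
  end)

IO.inspect(micros, label: "spawn + round-trip (microseconds)")
```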
My question is: how can this concurrent implementation still be faster? (3 times faster(!) on my machine.) It seems like we can just introduce processes and automatically gain a speed-up 😃
P.S. I know that some compilers are smart enough to automatically detect pieces of code that can run in parallel. However, I don't think the current example is such a case. Each response message should be received only by the `receive` macro in the `Enum.map` iteration that sent the corresponding request message. Even though in this particular case the one-to-one relationship between iterations and response messages is not important (we multiply the results together anyway, so the order doesn't matter), I don't think the compiler would be brave enough to perform such an optimization.
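For what it's worth, if that one-to-one pairing ever did matter, I assume we could enforce it ourselves by tagging each request with a unique reference (just a sketch, not part of the tutorial):

```elixir
parent = self()
ref = make_ref()

worker =
  spawn(fn ->
    receive do
      {:calc, ^ref, list} ->
        send(parent, {ref, Enum.reduce(list, 1, &(&1 * &2))})
    end
  end)

send(worker, {:calc, ref, [1, 2, 3, 4]})

# only a reply carrying OUR ref can match here
receive do
  {^ref, product} -> IO.inspect(product, label: "product")
end
```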
The issue is still the fact that the tutorial is incorrect/misleading in this part. This is a section titled "Concurrency", but the processes (one for each chunk) are not running concurrently. Each is spawned and finishes its job before the next one is spawned, as @IhorTsoi observed, because `receive` blocks the `Enum.map` from progressing to the next item.

The tutorial should IMO be fixed to use the modified implementation proposed by @IhorTsoi.
Hi 😊
Thank you very much for a great introduction to Elixir!
I got a little bit stuck at the "Concurrency" section. I would be more than happy if you could explain some details to me.
The `Factorial.spawn` function is implemented as in the code pasted earlier in this thread. Each chunk is processed by the `Enum.map` function, and the processing for each chunk goes like this:

1) create a new process (passing it a function and the current chunk to be processed);
2) send a message with the main process's PID to the newly created process (the newly created process receives the message and starts calculating);
3) wait until the process responds with the calculated result for the current chunk.

As far as I understand, iterations of `Enum.map` are executed sequentially. On each iteration we spawn a process and then wait until it responds with a calculated result. Thus, it looks like, even though we utilize processes, the calculations are still done sequentially.

Initially, I thought that a modified implementation would execute faster, because it first starts all the calculations and only then waits for all of them to finish (not one-by-one).
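Roughly, what I had in mind was something like this (a sketch of the idea; I kept the tutorial's helper names, only the shape of `spawn/1` changes):

```elixir
def spawn(n) do
  1..n
  |> Enum.chunk_every(4)
  # first pass: start ALL the workers and hand each one our PID
  |> Enum.map(fn(list) ->
    pid = spawn(Factorial, :_spawn_function, [list])
    send(pid, self())
    pid
  end)
  # second pass: only now collect the partial products, one reply per worker
  |> Enum.map(fn(_pid) ->
    receive do
      product -> product
    end
  end)
  |> calc_product()
end
```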
However, to my surprise, both versions took almost the same time to execute on my machine. The initial version was even a little bit faster. I tried experimenting with the input values, but the results stayed the same.
This is where I got totally confused 😢
My only guess is that perhaps the BEAM is able to optimize the execution of `Enum.map` and start the next iteration while it waits for the result from the `receive` macro... But I would expect that to be documented somewhere, and I haven't found anything like this.

Could you please point me to some information on how this is possible (perhaps some docs or articles), so that I could understand this part better? If I am wrong, please correct me.
Maybe this information would also be useful for other readers.
Thank you!