haskell-distributed / distributed-process

Cloud Haskell core libraries
http://haskell-distributed.github.io

whereisRemoteAsync does not work via LAN #346

Closed jgotoh closed 4 years ago

jgotoh commented 4 years ago

Hi,

I have a problem looking up remote processes over a LAN. I created a small project [1] that demonstrates it.

Basically whereisRemoteAsync does not work for me:

First, I start a process at a specified IP address and port:

cabal new-run exes -- --startip IP --startp PORT

where IP is the LAN address reported by hostname -I and PORT is an arbitrary open port.

This command starts a Process (see function startProcess in [2]) that registers itself in the local process registry and prints the address of the created Node, e.g. IP:PORT:0. The process then waits to receive a message.
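For readers who do not want to open [2], the following is a minimal sketch of what such a startup step might look like. It is not the code from the linked project: the host, port, and registry name "pong-server" are hypothetical placeholders, and the createTransport call assumes network-transport-tcp 0.7 or later (earlier versions take different arguments).

```haskell
module Main where

import Control.Distributed.Process
import Control.Distributed.Process.Node (initRemoteTable, newLocalNode, runProcess)
import Network.Transport.TCP (createTransport, defaultTCPAddr, defaultTCPParameters)

main :: IO ()
main = do
  -- Hypothetical values; in the issue these come from --startip and --startp.
  let host = "192.168.0.10"
      port = "8080"
  -- Assumes the network-transport-tcp >= 0.7 signature (TCPAddr -> TCPParameters -> ...).
  result <- createTransport (defaultTCPAddr host port) defaultTCPParameters
  case result of
    Left err -> putStrLn ("could not create transport: " ++ show err)
    Right transport -> do
      node <- newLocalNode transport initRemoteTable
      runProcess node $ do
        self <- getSelfPid
        -- Register under a hypothetical name so that other nodes can look it up.
        register "pong-server" self
        -- Print the node address so it can be passed to the joining node.
        liftIO (print (processNodeId self))
        -- Block until a String message arrives.
        msg <- expect :: Process String
        liftIO (putStrLn ("received: " ++ msg))
```

The printed node id contains the IP:PORT:0 address that the joining node needs.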

After that, on another computer, I use: cabal new-run exes -- --ip ANOTHER_IP --p PORT --j IP:PORT:0

Again, ANOTHER_IP is taken from hostname -I, PORT is a random open port. IP:PORT:0 is the address of the node the first process printed.

Function joinProcess in [2] then tries to join the started process by calling whereisRemoteAsync, but this only works when I run both processes on the same computer.
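A lookup of this kind can be sketched roughly as follows. This is an illustration rather than the joinProcess function from [2]: the helper name lookupRemote and the five-second timeout are assumptions. whereisRemoteAsync only sends the query; the answer arrives later as a WhereIsReply message in the caller's mailbox.

```haskell
module LookupRemote (lookupRemote) where

import Control.Distributed.Process
import qualified Data.ByteString.Char8 as BS
import Network.Transport (EndPointAddress (..))

-- | Hypothetical helper: ask the node at the given address (e.g. "IP:PORT:0")
-- for the process registered under the given name.
lookupRemote :: String -> String -> Process (Maybe ProcessId)
lookupRemote addr name = do
  -- Build the NodeId from the textual endpoint address printed by the other node.
  let nid = NodeId (EndPointAddress (BS.pack addr))
  -- Fire the query; this call returns immediately.
  whereisRemoteAsync nid name
  -- Wait up to 5 seconds (an arbitrary choice) for the reply.
  reply <- expectTimeout 5000000
  case reply of
    Just (WhereIsReply _ mPid) -> return mPid
    Nothing                    -> return Nothing
```

A result of Nothing can mean either that no reply arrived before the timeout or that the remote node answered but has no process registered under that name.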

The strange thing is that I can successfully connect to the other Endpoint when using the "raw" functions from the Network.Transport package. That is, I can establish a Connection and use its send function, and the other node successfully receives the message.
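The "raw" path described here corresponds roughly to the sketch below, which bypasses distributed-process and uses only Network.Transport. The addresses and the payload are hypothetical, and createTransport again assumes network-transport-tcp 0.7 or later; the project's --raw mode may differ in detail.

```haskell
module Main where

import qualified Data.ByteString.Char8 as BS
import Network.Transport
import Network.Transport.TCP (createTransport, defaultTCPAddr, defaultTCPParameters)

main :: IO ()
main = do
  -- Hypothetical local address of the joining machine.
  result <- createTransport (defaultTCPAddr "192.168.0.11" "8081") defaultTCPParameters
  case result of
    Left err -> putStrLn ("could not create transport: " ++ show err)
    Right transport -> do
      ep <- newEndPoint transport
      case ep of
        Left err -> putStrLn ("could not create endpoint: " ++ show err)
        Right endpoint -> do
          -- Hypothetical address printed by the first node (IP:PORT:0).
          let remote = EndPointAddress (BS.pack "192.168.0.10:8080:0")
          conn <- connect endpoint remote ReliableOrdered defaultConnectHints
          case conn of
            Left err -> putStrLn ("connect failed: " ++ show err)
            Right c -> do
              -- send takes a list of ByteString chunks.
              sent <- send c [BS.pack "ping"]
              either (\e -> putStrLn ("send failed: " ++ show e)) return sent
              close c
      closeTransport transport
```

If this succeeds while the whereisRemoteAsync lookup does not, the transport layer itself can reach the remote endpoint, which points away from a plain connectivity problem.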

If you want to try, run the command from above with the --raw flag added:

cabal new-run exes -- --ip ANOTHER_IP --p PORT --j IP:PORT:0 --raw

Because I can establish the connection, I don't think it is a firewall problem. (I can even establish a raw TCP connection and send data to the first Endpoint using netcat via nc IP PORT.)

Also, running netstat -a -n shows that the port is open and listening for connections on the correct LAN IP.

Do you have any advice regarding this problem? I have no idea why it does not work or how I can obtain the ProcessId.

Greetings, Julian

[1] https://github.com/jgotoh/ping-pong
[2] https://github.com/jgotoh/ping-pong/blob/master/Main.hs

jgotoh commented 4 years ago

Okay, I solved the problem myself: instead of depending on the version of distributed-process on Hackage, I now use the current master branch revision.