It4innovations / hyperqueue

Scheduler for sub-node tasks for HPC systems with batch scheduling
https://it4innovations.github.io/hyperqueue
MIT License

Add option to connect via IPv4 #539

Closed ghuls closed 1 year ago

ghuls commented 1 year ago

Hostnames in our cluster have IPv4 and IPv6 addresses, but the IPv6 addresses are not reachable from other nodes.

It would be great if it were possible to force connections over IPv4 on request.
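To make the request concrete, here is a minimal sketch of what "forcing IPv4" means at the resolver level. It is plain Python using the standard socket module, not HyperQueue code, and the function name resolve_ipv4_only is hypothetical. The only mechanism it relies on is the family argument of getaddrinfo, which restricts the lookup to IPv4 (A record) results.

```python
import socket


def resolve_ipv4_only(host, port):
    """Return only the IPv4 socket addresses for host:port.

    Hypothetical helper for illustration; passing family=AF_INET makes the
    resolver skip IPv6 results entirely.
    """
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET the sockaddr is an (ip, port) tuple.
    return [sockaddr for _family, _type, _proto, _canon, sockaddr in infos]
```

A client would then connect to one of the returned (ip, port) pairs instead of the bare hostname, which has the same effect as starting the server with its IPv4 address.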

ghuls commented 1 year ago

Just to add.

A local worker on the same host as the server can connect. A worker on another machine can't connect to the IPv6 address, but it can connect if the server was started with the IPv4 address instead of a hostname.

Kobzol commented 1 year ago

Hi. I see several ways of resolving this.

1) Currently, the server basically listens on the IPv4 address 0.0.0.0:<port>. We could add an option for the server to listen on IPv6 instead, like hq server start --ip-version=6.

2) Currently, the worker connects to the first address returned by the DNS lookup of the server's hostname, which is, in retrospect, incorrect, because on your cluster the first address is probably IPv6. I changed the code in https://github.com/It4innovations/hyperqueue/pull/541 so that it tries all the addresses. This will only work if your DNS can actually resolve to the IPv4 address of the server though.
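The "try all the addresses" idea from point 2 can be sketched as follows. This is an illustration in Python under stated assumptions, not the code from the linked PR: the names resolve_all, open_socket and connect_to_any are hypothetical, and the connect step is passed in as a parameter so the fallback logic can be exercised without a network.

```python
import socket


def resolve_all(host, port):
    """Return every (family, sockaddr) pair the resolver reports for host:port."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [(family, sockaddr) for family, _type, _proto, _canon, sockaddr in infos]


def open_socket(family, sockaddr, timeout=5.0):
    """Open a TCP connection to one address, closing the socket on failure."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
    except OSError:
        sock.close()
        raise
    return sock


def connect_to_any(candidates, connect=open_socket):
    """Try each candidate in order and return the first successful connection.

    If every attempt fails, the last error is re-raised, so an unreachable
    IPv6 address no longer hides a reachable IPv4 one.
    """
    last_error = None
    for family, sockaddr in candidates:
        try:
            return connect(family, sockaddr)
        except OSError as err:
            last_error = err
    if last_error is None:
        raise OSError("no candidate addresses")
    raise last_error
```

Usage would look like connect_to_any(resolve_all(hostname, port)). Python's standard socket.create_connection applies the same try-each-address strategy, so in practice the hand-written loop is only needed when the connect step has to be customized.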

I will send you a link to a binary that you can try after the CI on that PR completes.

Kobzol commented 1 year ago

You can try the modified binary here.

ghuls commented 1 year ago

> 2. This will only work if your DNS can actually resolve to the IPv4 address of the server though.

The DNS server we use returns both IPv6 and IPv4 addresses for the server host.

https://github.com/It4innovations/hyperqueue/pull/541 fixes the problem for me indeed.

Kobzol commented 1 year ago

Great, thanks for confirming it! The change will appear in the next stable release; in the meantime, you can use the provided binary.