JuliaParallel / ClusterManagers.jl

Other
232 stars 74 forks source link

Update ElasticManager constructor auto IP config #190

Closed mikeingold closed 1 year ago

mikeingold commented 1 year ago

Ref discussion at JuliaLang/julia#189

This PR updates the ElasticManager constructor to enable support for automatic IP address detection on Windows (and potentially other platforms that aren't macOS or Linux). Prior to this change, a custom get_private_ip() function is used to detect the platform and execute system cmds to look for the machine's IP address. These changes obsolete that function in favor of a call to Sockets.getipaddr(), a utility that was not available until the v1.2 release of Julia.

It's possible for Sockets.getipaddr() to throw an error if no network interfaces are detected, so I've implemented a try-catch to generate a helpful error message, suggesting the IP address be specified manually. This message is largely derived from the prior get_private_ip() error message.

I tested this modified branch on an up-to-date Windows 11 machine and verified that it correctly resolved the local IP address with addr=:auto, where the prior version would simply throw an error that the feature is only supported on Mac/Linux. I've also tested it on an up-to-date Mac Mini M1 and it also correctly identified the local IP address and seems to work fine.

Pros:

Cons:

kescobo commented 1 year ago

As I said in the issue, I'm definitely in favor of this. Presumably anyone in a restrictive with enough environment that they can't update Julia also isn't going to be updating this package to take advantage of this change. If there are no objections, I'll merge at the end of the week (please feel free to bump if/when I forget)

mikeingold commented 1 year ago

Just realized that I should also have updated the README to clarify that this is no longer OS-specific functionality. Working it now...

mikeingold commented 1 year ago

Done.

Moelf commented 1 year ago

btw regarding the interface issue, I think Julia needs to fix this:

kescobo commented 1 year ago

@Moelf Does that affect whether or not to merge this, or just an upstream fix that would make this unnecessary?

Moelf commented 1 year ago

this PR uses getipaddr() but I'm saying that function can return result that is within the range of reserved private IP, and we should probably filter it to make this PR more robust

mikeingold commented 1 year ago

I'm trying to understand the linked issue, but it doesn't seem like I'm running any systems/configurations susceptible to it. The prior solution was a get_private_ip() function that runs an external shell Cmd, e.g. hostname -i in Linux, and appears to simply accept the first IP returned. The Sockets.getipaddr version farms out to Sockets.getipaddrs, which seems to attempt the same thing but by instead using a ccall to libuv(?). It seems like the only stage at which filtering is even possible is within one of these external functions since they return only the single IP. Barring that, the only obvious solutions seem to be: (1) spinning our own variant, which was the status quo, or (2) contribute improvements to the Sockets functions and then merge some version of this PR. Without access to a system that experiences the referenced issue, I'm not sure how much help I can offer with the latter.

Moelf commented 1 year ago

I don't mind merging this PR

mikeingold commented 1 year ago

@kescobo I'm good with moving forward on the merge if no one else objects.