JuliaParallel / ClusterManagers.jl

Other
242 stars 74 forks source link

Make ElasticManager more convenient #136

Closed marius311 closed 4 years ago

marius311 commented 4 years ago

This adds two things discussed in https://github.com/JuliaParallel/ClusterManagers.jl/issues/128

First, it gives ElasticManager the addr=:auto option on Linux, which looks up the master's IP address via hostname --ip-address so that you get the private IP and other workers on the LAN can connect to it. This isn't foolproof depending on the network setup, but its 4 for 4 on the HPC clusters I have access to.

Second, it adds a line to what is printed when you evaluate an ElasticManager,

julia> em = ElasticManager(addr=:auto, port=0)
ElasticManager:
  Active workers : []
  Number of workers to be added  : 0
  Terminated workers : []
  Worker connect command : 
    /home/user/bin/julia --project=/home/user/myproject/Project.toml -e 'using ClusterManagers; ClusterManagers.elastic_worker("4cOSyaYpgSl6BC0C","127.0.1.1",36275)'

So that you can just copy/paste that last line without thinking and get a worker.

Finally, I reworked the readme and removed the part about the deprecated @scheduled which doesn't work anymore, even if you switch to @async.

Hopefully this makes setting up an ElasticManager way more streamlined. Let me know if you there's any feedback, I'm happy to make changes.

Closes https://github.com/JuliaParallel/ClusterManagers.jl/issues/128

marius311 commented 4 years ago

Just a friendly ping for whoever can merge this. I'm happy to make any changes requested.

kescobo commented 4 years ago

I can merge, but don't know if I should merge. I got the write bit to help a bit with tests at some point, but don't feel like I have enough authority to actually merge.

bjarthur commented 4 years ago

yea, me too. would be nice to have other elasticmanager users chime in with comments.

and a test certainly would be nice. could you possibly add one that simply runs the command this PR now prints out to make sure a julia process is actually launched? e,g., no typos in the command string.

marius311 commented 4 years ago

Sounds good, will add a test.

marius311 commented 4 years ago

Ok, added a test which actually runs the command and checks a worker connects. Also figured out how to do it on Mac. Ready from my end!

marius311 commented 4 years ago

Merge?

marius311 commented 4 years ago

Hey guys, could this possibly be merged?

I'm around to do any quick fixes should problems arise, but personally has been working great for me, e.g.:

image

kescobo commented 4 years ago

Since no one else has chimed in, there's a test written, and it seems like actual maintainers of this package don't check in too often, I'm going to go ahead and merge. We can always revert if someone bitterly complains after the fact.

That said, I don't know how I feel about tagging a version given that one of the maintainers did actively object to that in #118