irmen / Pyro4

Pyro 4.x - Python remote objects
http://pyro4.readthedocs.io/
MIT License

Shutdown a nameserver in Pyro4 without manual interruption #247

Closed: abodh closed this issue 1 year ago

abodh commented 1 year ago

I know the current development is in Pyro5, but I am using another package (Pyomo) that still depends on Pyro4. For some reason, I cannot terminate the name server when submitting a Slurm job. I have a name server as one process, a dispatch server as another, and a few more processes for the solver servers. I get the results, and everything works fine. However, it is annoying that all of the solver servers and the dispatch server get terminated at the end, but the name server does not. This is okay if I run it from a terminal, where I can force the name server to shut down with Ctrl + C, but how can I enforce this when submitting a job via Slurm?

The dispatch server can be terminated with proxy.shutdown(), where proxy = Pyro.Proxy("URI of dispatch server"), followed by proxy._pyroRelease(). However, the same approach does not work for the name server, and the job hangs at that point until I cancel it with scancel. This prevents me from submitting multiple job arrays, because the name server is not terminated without manual interruption.

Is there a way to handle this?

I tried the following:

import Pyro4 as Pyro  # assumed alias; this snippet refers to the Pyro4 module as `Pyro`

ns = Pyro.locateNS()
uri = ns.lookup("Pyro.NameServer")
proxy = Pyro.Proxy(uri)

proxy.shutdown()  # also tried proxy._shutdown() and proxy._pyroshutdown()
proxy._pyroRelease()

But this raises an error: as far as I can tell, the proxy does not have any of those methods for shutting down a server.

irmen commented 1 year ago

It's by design that you cannot programmatically shut down a Pyro name server via a proxy connection to it! That would cause havoc, as a running name server is a critical component.

Why is it that you even have to shut down the name server? Just keep it running...

abodh commented 1 year ago

Yes, I could keep it running, but I am submitting the job on the HPC cluster via slurm and the job would not end until the name servers are terminated. So even though I can export the results, I could not terminate the job and would have to cancel it manually. The institution has only assigned a few nodes for my project, so this is kind of hindering me from automating multiple job submissions.

As a workaround, for now, I just put a random error at the end of my code to at least terminate the job on the cluster but I thought there would be a better way to do this (like terminating the nameserver). Anyway, thank you for responding back. I really appreciate that :)
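One possible alternative to the deliberate error, sketched here under the assumption that the job script itself is what launches the name server: start the name server as a child process and terminate that process once the work is done, so nothing has to go through a proxy. The helper names `background_process` and `run_job` are hypothetical, and the command line assumes Pyro4 is installed in the same interpreter (the name server can be started with `python -m Pyro4.naming`).

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def background_process(command, grace_seconds=10.0):
    """Start `command` as a child process and stop it when the block exits."""
    proc = subprocess.Popen(command)
    try:
        yield proc
    finally:
        if proc.poll() is None:  # child is still running
            proc.terminate()  # polite stop request (SIGTERM on POSIX)
            try:
                proc.wait(timeout=grace_seconds)
            except subprocess.TimeoutExpired:
                proc.kill()  # force the child down if it ignored the request
                proc.wait()


# Assumption: Pyro4 is importable by this interpreter.
NAMESERVER_COMMAND = [sys.executable, "-m", "Pyro4.naming"]


def run_job(work, command=NAMESERVER_COMMAND):
    """Run `work()` while the child process is alive, then stop the child."""
    with background_process(command):
        return work()
```

In a real job, `work` would be the function that starts the dispatch and solver servers and collects the results. The name server needs a moment to come up, so `work` may have to retry its first lookup. Because the child is stopped in a `finally` clause, it is also cleaned up when `work` raises.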

irmen commented 1 year ago

I don't know what Pyomo or Slurm are (other than a Futurama reference). Maybe it's better to ask these questions over at their projects? Maybe they have some management mechanism that allows some control over the Pyro servers?

Alternatively, maybe you can investigate whether the name server is required at all; perhaps you can "hardcode" the Pyro server URIs rather than relying on the name server to look them up?
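A minimal sketch of what hardcoding could look like, assuming you control how the server objects are registered (which may not be the case when Pyomo starts them for you). A direct Pyro URI has the form PYRO:objectid@host:port, so a client can connect without any name server if the server registers its object under a fixed id on a fixed port. The function names, object id, host and port below are hypothetical placeholders.

```python
def make_pyro_uri(object_id, host, port):
    """Build a direct Pyro URI of the form PYRO:<object id>@<host>:<port>."""
    return f"PYRO:{object_id}@{host}:{port}"


def serve_at_fixed_location(obj, object_id, host, port):
    """Server side: expose `obj` under a fixed id on a fixed host and port."""
    import Pyro4  # imported lazily so the URI helper works without Pyro4

    # `obj` is assumed to be an instance of a class marked with @Pyro4.expose.
    daemon = Pyro4.Daemon(host=host, port=port)
    uri = daemon.register(obj, objectId=object_id)
    print("serving at", uri)
    daemon.requestLoop()  # blocks until the daemon is shut down


def connect_directly(object_id, host, port):
    """Client side: create a proxy from the hardcoded URI, no name server."""
    import Pyro4

    return Pyro4.Proxy(make_pyro_uri(object_id, host, port))
```

For example, `connect_directly("example.worker", "node01", 9090)` would return a proxy for the object served by `serve_at_fixed_location(worker, "example.worker", "node01", 9090)`. The trade-off is that host and port must be known to every process up front, which is the bookkeeping the name server normally does for you.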

abodh commented 1 year ago

I will try the options that you have provided. Thank you again!