irmen / Pyro5

Pyro 5 - Python remote objects
https://pyro5.readthedocs.io
MIT License
303 stars 36 forks source link

Pyro takes 5 seconds to run the example `greeting_client.py` script #89

Closed bmitc closed 3 months ago

bmitc commented 6 months ago

I am using Python 3.12.1 on both Windows 11 with Pyro5 5.15, and I started out my testing of Pyro5 with the example show here: https://pyro5.readthedocs.io/en/latest/intro.html#simple-example

    import time
    import Pyro5.api

    name = input("What is your name? ").strip()

    greeting_maker = Pyro5.api.Proxy(
        "PYRONAME:example.greeting"
    )  # use name server object lookup uri shortcut

    start_time = time.time()

    print(greeting_maker.get_fortune(name))

    print("Execution time: " + str(time.time() - start_time))
poetry run python -m Pyro5.nameserver
poetry run python .\pyro_test\greeting_server.py
poetry run python .\pyro_test\greeting_client.py

This takes around 5 seconds my relatively powerful machine to run greeting_maker.get_fortune(name). What is going on here? I am investigating Pyro5 to build a distributed system with some fault tolerance, all of the Pyro5 servers would run on the same machine. I am attempting to have a system that would have Erlang/Elixir type of properties but in Python. But this performance has me concerned.

bmitc commented 6 months ago

On the same machine, it takes about 0.01 seconds to run the same remote method call on Ubuntu 22.04 running through WSL2.

irmen commented 6 months ago

Investigate where the time is spent. Sounds to me like a DNS lookup issue that blocks the program. I.e. a configuration issue on your windows installation. (since the ubuntu side just runs it normally).

bmitc commented 6 months ago

What are your suggestions? The time is spent inside Pyro.

What type of configuration issue would cause this? This is a brand new machine and OS install.

irmen commented 6 months ago

"inside Pyro" is extremely vague. Most of what pyro does is calling the OS's network socket functions, for example.

I suspect the locate_ns() call is not finding the name server directly so it reverts to a slower broadcast solution. There are various options to work around this, if you don't want to fix the system's DNS. These are all mentioned in the Pyro documentation. Start here for example https://pyro5.readthedocs.io/en/latest/nameserver.html#configuration-items

You can also enable logging and inspect the trace log file to see where the delay occurs.

irmen commented 6 months ago

Alternatively, you could try another one of the examples such as the "benchmark" one. It doesn't use a name server , so it may tell you if the delay is indeed related to the NS lookup. https://github.com/irmen/Pyro5/tree/master/examples/benchmark

bmitc commented 6 months ago

"inside Pyro" is extremely vague.

Not to a user. I don't know how Pyro works. :) But thank you for the debug suggestions, as I've tried a few of those. (See below.)

Is this an uncommon issue on Windows? One thing I could try is another machine, but I don't have any other issues with networking on this machine, and it's about a two-week-old install of Windows 11. I tested a nameserver instance by Test-NetConnection localhost -port 9090 and this returned quickly (under a second).

I'll also dig into the documentation some more.

I suspect the locate_ns() call is not finding the name server directly so it reverts to a slower broadcast solution.

The documentation says that if the nameserver binds to localhost then the broadcast responder is not started. Is that correct? If so, then that would mean it isn't falling back to the broadcast server. On Windows, I see the message Not starting broadcast server for IPv6. printed out, but on Ubuntu, Not starting broadcast server for localhost. is printed. So it seems IPv6 is being used on Windows and IPv4 on Ubuntu.

Alternatively, you could try another one of the examples such as the "benchmark" one. It doesn't use a name server , so it may tell you if the delay is indeed related to the NS lookup.

I tried the examples by just directly passing in the URI on Windows earlier, and that performance is indeed closer to the Ubuntu side.


With the help of poetry run python -m trace --trace .\pyro_test\greeting_client.py (thanks again for that suggestion), I found that the program is stalling either here:

idna.py(176):             return result, len(input)
<---------------- stalled here ------------------------->
socket.py(964):         af, socktype, proto, canonname, sa = res

or here:

idna.py(176):             return result, len(input)
<---------------- stalled here ------------------------->
core.py(221):                 except socket.error:
core.py(222):                     hosts = [config.NS_HOST]

It moves too fast when it resumes after stalling to see, but I suspect it is from the latter. After inspecting the code in that area of core.py, I saw this:

https://github.com/irmen/Pyro5/blob/5fb01330f3dc2cc78ea29b191d9c572a2effb473/Pyro5/core.py#L219

I tried out

import socket
socket.gethostbyaddr("127.0.1.1")

on both Windows and Ubuntu. On Windows it excepts after a 5 second timeout, and on Ubuntu it returns immediately. I'm guessing that this piece of code is getting hit on Windows but not on Ubuntu.

irmen commented 6 months ago

Pyro itself can also write a trace log file if you set the correct configuration options.

But it seems Windows does something funky related to "localhost" ip resolution. Easiest probably is to not let the name server (and maybe even the other Pyro servers) bind on "localhost" but instead supply the actual host name of the machine to bind on. Or disable the IPV6 usage in a configuration option, if that turns out to be problematic on Windows. All relevant configuration options can either be set on the command line or via environment settings. Look here https://pyro5.readthedocs.io/en/latest/config.html#configuring-pyro

bmitc commented 3 months ago

Thank you for the pointers to the configuration options.