rawpython / remi

Python REMote Interface library. Platform independent. In about 100 Kbytes, perfect for your diet.
Apache License 2.0
3.48k stars 401 forks source link

Dynamic server address:port for the client websocket #508

Open vladkol opened 1 year ago

vladkol commented 1 year ago

In remote access scenarios, the server often is not aware of the URL, IP address and port the users will be accessing it. It happens when the server is running behind a load balancer, dynamically allocated IP address, etc.

This PR adds dynamic_web_address to the start function as well as to the Server class. When dynamic_web_address is True, the javascript client code takes the host address and port from document.location property.

The original behavior stays by default, so the existing code should run as it is.

dddomodossola commented 1 year ago

Hello @vladkol thank you very much for this PR. I'm not sure to understand the use case. Actually the server builds the client page using the socket information retrieved by Server.net_interface_ip method. Once a client connects, the socket info are used to build the page javascript code for websocket connection. So, the client knows where to connect. What is the purpose of your changes? Consider that I don't know exactly what a load balancer is, I read about it, but never used one. So I'm not an expert in this field. Please let me understand.

vladkol commented 1 year ago

Hello @dddomodossola, thank you for looking at my PR, and for an awesome library! The issue with net_interface_ip is that in real-world remote scenarios with server in the cloud , they are rarely exposed by their internal IPs. When the server tries to get its IP, it's normally given some address only available in internal network - one that service nodes use to talk to each other. The server doesn't know which external IP address it's exposed with. Same applies to the port, and happens very often. And even pathname of the URL in the client request may be different from what reaches your server because of URL rewriting.

What users deal with is a load balancing/front door service, such is Cloud CDN in GCP, Cloud Front in AWS, and Front Door in Azure. Depending the host name (and sometimes even the pathname!) of the URL, such service routes the traffic to different internal networks and server nodes. Here is a pretty generic and simple use case: GCP Load Balancing As I mentioned above, the traffic may also be routed differently based on the URL path, as described here. Your requests to google.com/maps and google.com are routed to entirely different server clusters. Same situation with AWS, Azure, Cloudflare, etc.

Those are not some special cases, but recommended ways of exposing services to the Internet. Even when you are hosting a single-node server, the traffic goes through multiple physical and virtual networks. The simplest web app in the cloud normally doesn't have a reserved IP-address, so the hostname of the URL may be resolved to a different IP for different users.

All this situations can be handled by using the actual in-browser URL when connecting to the webscocket. Normally, the client code should never use the IP address to connect to HTTP servers, unless the server doesn't have DNS name at all. But even then, that IP address will be in the URL and document.location.host. However, I tend to assume that some adopters of your library absolutely need to use the IP address. This is why keeping the default behavior as it is may be reasonable, at least for some time.

I came up with this change simply because otherwise remi doesn't work when the server is running in a container in App Engine, Kubernetes Engine or Cloud Run. My previous experience with other clouds tells be the same issue will happen there.

dddomodossola commented 1 year ago

Thank you very much for the explanation. I like your changes 😉

vladkol commented 1 year ago

Great! Let me know if you need anything else before you merge it. Thank you!