Support multiple nodes (i.e. grid)

stevepryde commented 4 years ago

One feature that was always in the plan was to support running multiple nodes, possibly on different hosts, and be able to forward requests over the network.

The basic idea is pretty straightforward. Xenon itself, being a proxy, can act as a WebDriver server. Therefore, if we want to forward requests to another instance of Xenon, rather than chromedriver etc, this should "just work" if we wire up the Xenon nodes as if they were just another session. Further, we don't need to worry about managing the number of browsers per node because Xenon already does this. If Xenon can expose the list of browsers it has, then the upstream Xenon instance (i.e. the hub) can just match on those and if matched, try to create a session. If it fails, move on to the next match. If all fail, return NoMatchingBrowsers etc.
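The match-and-fall-through behaviour described above can be sketched roughly as follows. This is only an illustration under assumed names: `Node`, `RouteError`, and `route_session` are hypothetical and are not taken from Xenon's source, and the `create_session` closure stands in for whatever request the hub would actually send to a node.

```rust
// Hypothetical types for illustration; none of these names come from Xenon itself.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub name: String,
    /// Browser names this node advertises, e.g. "chrome" or "firefox".
    pub browsers: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum RouteError {
    /// No node advertised the browser, or every matching node failed to create a session.
    NoMatchingBrowsers,
}

/// Try each node that advertises `browser`, in order, and return the name of the
/// first node on which `create_session` succeeds.
pub fn route_session<F>(
    nodes: &[Node],
    browser: &str,
    mut create_session: F,
) -> Result<String, RouteError>
where
    F: FnMut(&Node) -> Result<(), String>,
{
    let matching = nodes
        .iter()
        .filter(|n| n.browsers.iter().any(|b| b == browser));

    for node in matching {
        if create_session(node).is_ok() {
            return Ok(node.name.clone());
        }
        // Session creation failed (for example, the node is at capacity):
        // fall through and try the next matching node.
    }
    Err(RouteError::NoMatchingBrowsers)
}
```

The hub does no capacity accounting of its own here. It relies on each node refusing a session it cannot serve, which is the point made above about Xenon already managing the number of browsers per node.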

There are a couple of ways to do the configuration. One way is to put the node configuration in the hub config file. There are advantages to this approach, one being that it's easier to figure out where a request might go. Just follow the chain. However, there are downsides too, mostly around what happens if a node becomes unreachable.

The other way is to put the hub address in each node's configuration. This seems better, because in theory you could have multiple nodes all with the same configuration. However, the hub also needs to know the address of each node, so that it can forward requests there. It's probably possible to use the remote address of each node when it registers with the hub, but I don't know how reliable this would be across all networks. To start with, I'm thinking it's probably OK just to have each node tell the hub its address and port explicitly. Also, if a node goes down, it can just register again when it starts back up.

There is also no difference between a hub and a node. I'm designing this so that any individual Xenon server can have its own local sessions and also remote sessions offered by other nodes. You could also nest the nodes several layers deep, although there is a performance penalty for doing this (every extra layer adds network latency to every request). It might also be possible to end up with a circular route, which would cause issues. I'll tackle this and other related complexities once the basic design is working.

I've already started on this - check out the nodes branch if you want to follow along. Feel free to offer advice if you think the design could be improved or if there's a better approach. This is not currently functional but it's getting close.

stevepryde commented 4 years ago

Currently the plan is for nodes to use a simple REST request to register and deregister, as well as to periodically update every time the number of available sessions changes.

However, I'd like to investigate using websockets for this, primarily because the server would then get instant feedback when a node has dropped out, and vice versa. If the websocket itself drops out but both servers stay running, it should be possible to re-register and carry on without losing the session.

Once I get the REST side working I'll look into using websockets (if it's easy to reuse the register/deregister code then I'll look at making this optional and support both). Regardless, the actual WebDriver sessions will always use REST. The websockets would only be for nodes to register/deregister/update.
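As a rough sketch of the bookkeeping behind the register, deregister, and update operations described above, independent of whether they arrive over REST or a websocket: the names `NodeRegistry` and `NodeInfo`, and the choice to key nodes by name, are assumptions made for illustration and do not reflect Xenon's actual code.

```rust
use std::collections::HashMap;

/// Hypothetical record of what a hub knows about one registered node.
#[derive(Debug, Clone, PartialEq)]
pub struct NodeInfo {
    pub url: String,
    pub available_sessions: u32,
}

/// Hypothetical in-memory registry, keyed by node name.
#[derive(Debug, Default)]
pub struct NodeRegistry {
    nodes: HashMap<String, NodeInfo>,
}

impl NodeRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Register a node. Registering again under the same name replaces the
    /// stale entry, so a node that restarts can simply re-register.
    pub fn register(&mut self, name: &str, url: &str, available_sessions: u32) {
        let info = NodeInfo {
            url: url.to_string(),
            available_sessions,
        };
        self.nodes.insert(name.to_string(), info);
    }

    /// Remove a node. Returns true if it was registered.
    pub fn deregister(&mut self, name: &str) -> bool {
        self.nodes.remove(name).is_some()
    }

    /// Record a new available-session count. Returns false for an unknown node.
    pub fn update(&mut self, name: &str, available_sessions: u32) -> bool {
        match self.nodes.get_mut(name) {
            Some(info) => {
                info.available_sessions = available_sessions;
                true
            }
            None => false,
        }
    }

    /// Look up a node by name.
    pub fn get(&self, name: &str) -> Option<&NodeInfo> {
        self.nodes.get(name)
    }
}
```

Keeping this logic separate from the transport is one way the same register/deregister code could be reused if both REST and websockets were supported.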

stevepryde commented 4 years ago

Small update:

For the initial implementation, configuration will only be done on the server end, and it will simply point to downstream nodes it can use. The downstream nodes will be unaware that the caller is also a Xenon instance.

Later, I'll implement a websocket connection between them, which will allow the downstream node to update its parent regarding how many sessions it has available.

This design also allows for multiple "hubs" to point to the same node, or pool of nodes. This could be useful in some cases.

One major benefit of this approach is that the two sides don't both need to know each other's IP/port. The hub only needs to know the network address(es) of the node(s), and these can go in the configuration as a list of URLs. The node needs no special configuration in order to be used as a node.

stevepryde commented 4 years ago

This is now working!

Currently the configuration is static and is loaded only on startup. In a future update it will be refreshed via websocket messages, so that any Xenon server will know immediately when the set of browsers a downstream node can service changes, and can relay that information upstream as well.

For now, each downstream node is polled at startup until its configuration is read successfully, and it is not updated after that. This is sufficient to wire everything up, and I have tested this with 2 servers and even 3 servers in a chain. Everything is working like a charm!
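The poll-until-read behaviour described above amounts to a simple retry loop. A minimal sketch, assuming a hypothetical `fetch` closure that stands in for the request that reads a downstream node's configuration (the function name and the fixed retry interval are illustrative choices, not Xenon's actual implementation):

```rust
use std::thread::sleep;
use std::time::Duration;

/// Call `fetch` repeatedly until it returns `Ok`, sleeping for `interval`
/// after each failed attempt. Returns the fetched value together with the
/// number of attempts it took. After the first success, no further polling
/// takes place.
pub fn poll_until_ready<T, E, F>(mut fetch: F, interval: Duration) -> (T, u32)
where
    F: FnMut() -> Result<T, E>,
{
    let mut attempts: u32 = 0;
    loop {
        attempts += 1;
        match fetch() {
            Ok(value) => return (value, attempts),
            // The node is not reachable yet: wait, then try again.
            Err(_) => sleep(interval),
        }
    }
}
```

Note that this loop never gives up, so a node that stays unreachable would block the caller indefinitely. A real server would more likely run it in a background task per node.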

I am thinking of doing more testing in this state, and then merging this to the main branch before beginning work on websockets.

Help with testing is always welcome.

To test out this feature, all you have to do is add a nodes: section to the config as follows:

nodes:
  - name: some_node_name
    url: localhost:8888

Then run a server pointing at this config, and run another server on port 8888 with a different config (just browser config, no nodes). Now you can point your Selenium test at the original server and your requests will be forwarded to the server on port 8888. I used localhost above as a demonstration, but this other server could sit anywhere on your network that is reachable from the first server.

stevepryde commented 4 years ago

@zhiburt you may be interested in this one :) Let me know if you have any feedback.

(this is on the nodes branch)

stevepryde commented 4 years ago

Merged in nodes branch as v0.4.0
