locka99 / opcua

A client and server implementation of the OPC UA specification written in Rust
Mozilla Public License 2.0

Async server #366

Open einarmo opened 4 months ago

einarmo commented 4 months ago

This PR being colossal, consider this an announcement that something like this exists, and not really a suggestion we go ahead and merge this right away. I'm interested in figuring out a way we can make this happen, because I think it is a good first step towards a full Rust implementation that takes a step beyond embedded.

This is obviously a lot of code, but I found splitting this up to be very hard. There are perhaps ways I could do part of it, but not without creating a really unpleasant intermediary state where we maintain both the old server structure and the new server structure at the same time, as well as a constantly evolving interface between the two. If needed we can figure out a way to do this in smaller chunks, but for now it stands as a single PR.

What and Why

This is a rewrite of the server from scratch, with the primary goal of taking the server implementation from a limited, mostly embedded server, to a fully fledged, general purpose server SDK. The old way of using the server does still more or less exist, see samples for the closest current approximation, but the server framework has changed drastically internally, and the new design opens the door for making far more complex and powerful OPC-UA servers.

Goals

Currently my PC uses about 1% CPU in release mode running the demo server, which updates 1000 variables once a second. This isn't bad, but I want this SDK to be able to handle servers with millions of nodes. In practice this means several things:

High level changes

First of all, there are some fundamental structural changes to better handle multiple clients and ensure compliance with the OPC-UA standard. Each TCP connection now runs in a tokio task, and most requests will actually spawn a task themselves. This is reasonably similar to how frameworks like axum handle web requests.
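As a rough illustration of the "one task per connection, one task per request" shape described above, here is a minimal sketch. It deliberately uses std threads and a toy newline-delimited protocol as a stand-in for tokio tasks and OPC-UA messages, so that it has no dependencies. `handle_request`, `serve_connection` and `run_server` are hypothetical names for this example, not functions from the PR.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::net::{TcpListener, TcpStream};
use std::sync::mpsc;
use std::thread;

/// Hypothetical request handler, standing in for an OPC-UA service call.
fn handle_request(request: &str) -> String {
    format!("ack:{request}\n")
}

/// Serve one connection. Every request line is handled on its own thread,
/// and a single writer thread sends the responses back as they complete.
fn serve_connection(stream: TcpStream) -> io::Result<()> {
    let reader = BufReader::new(stream.try_clone()?);
    let mut writer = stream;
    let (tx, rx) = mpsc::channel::<String>();

    let writer_thread = thread::spawn(move || {
        for response in rx {
            if writer.write_all(response.as_bytes()).is_err() {
                break;
            }
        }
    });

    for line in reader.lines() {
        let line = line?;
        let tx = tx.clone();
        // One unit of work per request (a tokio task in the real server).
        thread::spawn(move || {
            let _ = tx.send(handle_request(&line));
        });
    }

    // Dropping our sender lets the writer finish once all requests are done.
    drop(tx);
    let _ = writer_thread.join();
    Ok(())
}

/// Accept loop: one unit of work per connection.
fn run_server(listener: TcpListener) {
    for stream in listener.incoming().flatten() {
        thread::spawn(move || {
            let _ = serve_connection(stream);
        });
    }
}
```

Because each request runs independently, responses may be written in a different order than the requests arrived; a real protocol would match them up by request handle.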

Subscriptions and sessions are now stored centrally, which allows us to implement TransferSubscriptions and properly handle subscriptions outliving their session as they are supposed to in OPC-UA. I think technically you can run multiple sessions on a single connection now, though I have no way to test this at the moment.
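To make the idea of centrally stored subscriptions concrete, the sketch below shows why a store that lives outside the session makes TransferSubscriptions straightforward. All type and method names here (`SubscriptionStore`, `close_session`, `transfer`, and so on) are assumptions for illustration, not the PR's actual API, and details such as subscription lifetimes and user checks are left out.

```rust
use std::collections::HashMap;

type SessionId = u32;
type SubscriptionId = u32;

/// A subscription that remembers which session, if any, currently owns it.
#[derive(Debug)]
struct Subscription {
    owner: Option<SessionId>,
}

/// Hypothetical central store, shared by all sessions on the server.
#[derive(Default)]
struct SubscriptionStore {
    subscriptions: HashMap<SubscriptionId, Subscription>,
    next_id: SubscriptionId,
}

impl SubscriptionStore {
    fn create(&mut self, session: SessionId) -> SubscriptionId {
        self.next_id += 1;
        let subscription = Subscription { owner: Some(session) };
        self.subscriptions.insert(self.next_id, subscription);
        self.next_id
    }

    /// Closing a session detaches its subscriptions instead of deleting them.
    fn close_session(&mut self, session: SessionId) {
        for subscription in self.subscriptions.values_mut() {
            if subscription.owner == Some(session) {
                subscription.owner = None;
            }
        }
    }

    /// Move a subscription to another session. Returns false if the id is unknown.
    fn transfer(&mut self, id: SubscriptionId, new_session: SessionId) -> bool {
        match self.subscriptions.get_mut(&id) {
            Some(subscription) => {
                subscription.owner = Some(new_session);
                true
            }
            None => false,
        }
    }

    fn owner(&self, id: SubscriptionId) -> Option<SessionId> {
        self.subscriptions.get(&id).and_then(|s| s.owner)
    }
}
```

If subscriptions were owned by the session object itself, closing the session would drop them and there would be nothing left to transfer.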

The web server is gone. It could have remained, but I think it deserves a rethink. It would be better (IMO), and deal with issues such as #291, if we integrate with the metrics library, and optionally export some other metrics using some other generic interface. In general I think OPC-UA is plenty complicated enough without extending it with tangentially related features, though again this might be related to the shift I'm trying to create here from a specialized embedded server SDK, to a generic OPC-UA SDK.

Events are greatly changed, and quite unfinished. I believe a solid event implementation requires not just more thought, but a proper derive macro to make implementing them tolerable. The old approach relied on storing events as nodes, which works, and has some advantages, but it's not particularly efficient, and required setting a number of actually superfluous values, e.g. setting the DisplayName of an event, which is a value that cannot be accessed, as I understand it. The new approach is just storing them as structs, dyn Event.
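A minimal sketch of what "events as structs behind dyn Event" could look like follows. The trait shape, the `TemperatureAlarm` type and `select_fields` are all hypothetical and chosen for this example; the PR's real `Event` trait may look quite different. The hand-written `match` in `field` is the kind of boilerplate a derive macro would be expected to generate.

```rust
/// Hypothetical event trait: an event is a plain struct that can report
/// its type and look up the fields a client selects.
trait Event {
    fn event_type(&self) -> &str;
    fn field(&self, name: &str) -> Option<String>;
}

/// Example event, stored as an ordinary struct rather than as address space nodes.
struct TemperatureAlarm {
    source: String,
    celsius: f64,
}

impl Event for TemperatureAlarm {
    fn event_type(&self) -> &str {
        "TemperatureAlarm"
    }

    fn field(&self, name: &str) -> Option<String> {
        match name {
            "SourceName" => Some(self.source.clone()),
            "Celsius" => Some(self.celsius.to_string()),
            _ => None,
        }
    }
}

/// Evaluate a list of selected field names against a batch of events.
fn select_fields(events: &[Box<dyn Event>], fields: &[&str]) -> Vec<Vec<Option<String>>> {
    events
        .iter()
        .map(|event| fields.iter().map(|name| event.field(name)).collect())
        .collect()
}
```

Only the fields a client actually asks for are ever produced, so nothing like an unused DisplayName has to be filled in.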

Node managers

The largest change is in how most services work. The server now contains a list of NodeManagers, an idea stolen from the .NET reference SDK, though I've gone further than they do there. Each node manager implements services for a collection of nodes, typically the nodes from one or more namespaces. When a request arrives we give each node manager the request items that belong to it, so when we call Read, for example, a node manager will get the ReadValueIds where the NodeManager method owns_node returns true.
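The dispatch described above can be sketched roughly as follows. Only the method name `owns_node` comes from the description; the simplified `NodeId`, the `read` signature, `NamespaceManager` and `dispatch_read` are assumptions made for this example, and the real trait is asynchronous and far larger.

```rust
/// Simplified node id: a namespace index plus a name.
#[derive(Debug, PartialEq)]
struct NodeId {
    namespace: u16,
    name: String,
}

/// Hypothetical, heavily reduced node manager trait.
trait NodeManager {
    fn owns_node(&self, id: &NodeId) -> bool;
    /// Read a batch of nodes; returns one value per requested id, in order.
    fn read(&self, ids: &[&NodeId]) -> Vec<String>;
}

/// A manager that owns every node in a single namespace.
struct NamespaceManager {
    namespace: u16,
    label: &'static str,
}

impl NodeManager for NamespaceManager {
    fn owns_node(&self, id: &NodeId) -> bool {
        id.namespace == self.namespace
    }

    fn read(&self, ids: &[&NodeId]) -> Vec<String> {
        ids.iter().map(|id| format!("{}:{}", self.label, id.name)).collect()
    }
}

/// Give each manager only the items it owns, then put the results back
/// into request order. Items no manager claims stay `None`.
fn dispatch_read(managers: &[Box<dyn NodeManager>], ids: &[NodeId]) -> Vec<Option<String>> {
    let mut results: Vec<Option<String>> = vec![None; ids.len()];
    for manager in managers {
        let owned: Vec<usize> = (0..ids.len())
            .filter(|&i| results[i].is_none() && manager.owns_node(&ids[i]))
            .collect();
        if owned.is_empty() {
            continue;
        }
        let batch: Vec<&NodeId> = owned.iter().map(|&i| &ids[i]).collect();
        for (i, value) in owned.into_iter().zip(manager.read(&batch)) {
            results[i] = Some(value);
        }
    }
    results
}
```

Each manager sees one batch per request rather than one call per node, which is what lets a manager backed by an external system fetch its values in bulk.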

There are some exceptions, notably the view services can often involve requests that cross node-manager boundaries. Even with this, the idea is that this complexity is hidden from the user.

Implementing a node manager from scratch is challenging; see node_manager/memory/diagnostics.rs for an example of a node manager with very limited scope (but one where the visible nodes are dynamic!).

To make it easier for users to develop their own servers, we provide them with a few partially implemented node managers that can be extended:

More node managers can absolutely be added if we find good abstractions, but these are solid enough to let us implement what we need for the time being.

Lost features

Some features are lost, some forever, others until we get around to reimplementing them. I could have held off on this PR until they were all ready, but it's already large enough.

General improvements

Integration tests are moved into the library as cargo integration tests, and they are quite nice. I can run cargo test in about 30 seconds, most of which is spent on some expensive crypto methods. There is a test harness that allows you to spin up a server using port 0, meaning that you get dynamically assigned a port, which means we can run tests in parallel arbitrarily.
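The port 0 trick can be shown with the standard library alone. This is a sketch of the mechanism rather than the PR's test harness, and `bind_ephemeral` is a hypothetical helper name.

```rust
use std::net::{SocketAddr, TcpListener};

/// Bind to port 0 so the OS picks a free port, then report which port
/// was actually assigned so a test client knows where to connect.
fn bind_ephemeral() -> std::io::Result<(TcpListener, SocketAddr)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}
```

Since every test gets its own listener on its own port, tests never compete for a fixed port number and can run concurrently.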

This almost certainly fixes #359, #358, #324, #291, and #281, and probably more.

Future work

See todo.md. The loose ends mentioned in this PR description need to be tied up, and there is a whole lot of other stuff in that file that would be nice to do.

AiyionPrime commented 4 months ago

This is huge 😄, not only in lines.

[...] consider this an announcement that something like this exists, and not really a suggestion we go ahead and merge this right away [...]

How do you feel about marking it as a draft until you'd suggest that?

einarmo commented 4 months ago

This is huge 😄, not only in lines.

[...] consider this an announcement that something like this exists, and not really a suggestion we go ahead and merge this right away [...]

How do you feel about marking it as a draft until you'd suggest that?

It's done, I won't touch it unless there are things that need changing. I suppose I am suggesting we merge it, kind of, I'm just saying I am very open to figuring out a way to make it more manageable.

locka99 commented 4 months ago

Sorry, I'm under time pressure at the moment. If there are discrete chunks of it that can be split out then it might be easier to look at. I should have time to have a broad look in the next few days.