Unifying the TLS libraries between WS, QUIC and Fetch/HTTPS

Specification

The current TLS situation in Polykey involves several different TLS libraries. This is not as secure as it could be. It's better to centralise on a single BoringSSL library. This simplifies how we expect the TLS system to operate, such as its dependency on operating system CA certificates. It also means there is only 1 TLS library to update for PK and to monitor for security vulnerabilities, and that library can be updated independently of the Node runtime.

[Diagram: TLS Situation in Polykey (Excalidraw)]

This requires:

1. A combined websocket client and server implementation that is portable. This could mean a fork of UWS to include a client, or a new WS implementation in Rust. Portability is important here because we need to deal with sockets directly. One way to deal with this is similar to the quiche library: if the websocket library is purely functional without any IO, an external system can do the actual IO.
2. The BoringSSL library has to be extracted out as its own shared object. This has to be loaded into Node.js and then somehow found by the other native addons, such as the QUIC addon based on quiche.
3. Because fetch and the https-related modules rely on Node's tls module, instead of replacing fetch and https to use BoringSSL, we could override the tls module with a custom tls module that has the same API but uses the BoringSSL code instead.

To make 1. possible, the websocket implementation would need to be generic to the underlying socket IO. That makes it similar to quiche, and the whole thing could be written in JS. In fact, a websocket library that is generic to both the underlying socket IO and the underlying crypto library would be best.
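To make the "no IO" idea concrete, here is a minimal sketch of a WebSocket frame codec written as pure functions over bytes, following the base framing in RFC 6455. The function names are hypothetical and this is not how UWS or any existing library is structured. It only handles unfragmented text frames up to 65535 bytes, and the mask key is passed in so that randomness can come from whichever crypto library ends up being used.

```javascript
// Sans-IO sketch: encode and decode WebSocket frames (RFC 6455 base framing)
// as pure functions over bytes. No sockets and no TLS: the caller owns all IO.
const OPCODE_TEXT = 0x1;

// Encode one unfragmented text frame. Client-to-server frames must be masked;
// the 4-byte maskKey is injected rather than generated here.
function encodeTextFrame(text, maskKey = null) {
  const payload = Buffer.from(text, 'utf8');
  if (payload.length > 0xffff) {
    throw new RangeError('this sketch supports payloads up to 65535 bytes');
  }
  const header = [0x80 | OPCODE_TEXT]; // FIN = 1, opcode = text
  const maskBit = maskKey ? 0x80 : 0x00;
  if (payload.length < 126) {
    header.push(maskBit | payload.length);
  } else {
    // 126 signals that a 16-bit big-endian length follows
    header.push(maskBit | 126, payload.length >> 8, payload.length & 0xff);
  }
  if (!maskKey) return Buffer.concat([Buffer.from(header), payload]);
  const body = Buffer.from(payload); // copy before masking in place
  for (let i = 0; i < body.length; i++) body[i] ^= maskKey[i % 4];
  return Buffer.concat([Buffer.from(header), Buffer.from(maskKey), body]);
}

// Try to decode one frame from the front of buf.
// Returns null when more bytes are needed, otherwise the frame plus the
// number of bytes consumed so the caller can advance its own buffer.
function decodeFrame(buf) {
  if (buf.length < 2) return null;
  const fin = (buf[0] & 0x80) !== 0;
  const opcode = buf[0] & 0x0f;
  const masked = (buf[1] & 0x80) !== 0;
  let length = buf[1] & 0x7f;
  let offset = 2;
  if (length === 126) {
    if (buf.length < 4) return null;
    length = buf.readUInt16BE(2);
    offset = 4;
  } else if (length === 127) {
    throw new RangeError('64-bit lengths are not handled in this sketch');
  }
  const maskLen = masked ? 4 : 0;
  if (buf.length < offset + maskLen + length) return null;
  const maskKey = masked ? buf.subarray(offset, offset + 4) : null;
  offset += maskLen;
  const payload = Buffer.from(buf.subarray(offset, offset + length));
  if (maskKey) {
    for (let i = 0; i < payload.length; i++) payload[i] ^= maskKey[i % 4];
  }
  return { fin, opcode, payload, consumed: offset + length };
}
```

The bytes returned by `encodeTextFrame` can be written to a TCP socket, a TLS stream, or anything else, and `decodeFrame` can be fed from whatever the external system reads. That separation is the property that would let TLS be swapped out underneath.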

I'm not sure that 2. is even possible. Node's native addons seem to be designed as statically linked objects that only dynamically link to object code that already exists in node's executable. There's no documentation on how a native addon can dynamically link to another shared object, or on how 2 native addons could share a common native library.

The main obstacle to 3. is that we don't have an HTTP/HTTPS library that is generic to the underlying socket IO. There's also the fact that mobile platforms may implement fetch on top of different underlying systems. So I'm not really sure here.

One possibility is to look at Electron's node. https://www.electronjs.org/blog/electron-internals-using-node-as-a-library

They have managed to compile nodejs as a shared object, and then swap out its underlying openssl for boringssl. There might be more flexibility if we copy how the Electron project builds nodejs, and maybe we will have a better way of bundling it in pkg as well. Doing so will however change how we expect to test things, because PK would no longer run on stock node but on our own custom node.

Additional context

#422
#155
https://github.com/nodejs/node/issues/25890#issuecomment-1560424895 - this indicates the possibility of using nodejs as a shared object which apparently has some flexibility - WATCH this thread!
#503
#234
https://github.com/uNetworking/uWebSockets.js/issues/877
https://github.com/uNetworking/uWebSockets.js/discussions/804
https://github.com/MatrixAI/js-db/issues/5 - discussion of how to dependency inject native crypto operations into the block encryption of the RocksDB, similar things have to be done in order to unify the TLS operations for WS, QUIC and Fetch/HTTPS (note that WS would need to have a runtimeless style like QUIC, while QUIC's usage of boringssl would need to be DIable somehow)

Tasks

I think the most promising route to achieve this is to look at Electron's build of nodejs and try to bring that into our build system. If we want to do this, it will be important to align the builds of both Polykey-CLI and Polykey-Desktop on the same node runtime.

@tegefaulkes another interesting idea is to create a WS library that works like quiche: write all the WS stuff in JS, and abstract it to allow generic IO and generic TLS/crypto.

Some notes on how to actually combine Node native plugins together: https://chat.openai.com/share/d09826e1-ebb0-4584-9e89-d379ac7363b8

The trick is like this:

1. Use require to load the native libraries in order. They are dynamically loaded.
2. Use the -l flag to dynamically link the second library against the first library. This ensures that the second library's code will be calling symbols that are exported by the first library.
3. This way, inside nodejs, there won't be a double loading of the same library. Once you require the first library, its symbols will be available to the second library.
4. In order for this to work, the libraries must have a well-understood path to avoid conflicts. Apparently dlopen will ask the dynamic linker if the same path has already been loaded. This may be challenging. Not sure how to solve this yet.
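One detail worth checking before relying on plain require: as far as I can tell from the Node docs, `process.dlopen()` defaults to `RTLD_LAZY`, and on Linux a library opened without `RTLD_GLOBAL` does not export its symbols to libraries loaded later. Below is a rough sketch of loading addons in order with explicit flags. The `loadInOrder` helper and the `.node` paths in the usage comment are hypothetical. The `dlopen` argument is injectable only so the ordering logic can be exercised without real binaries.

```javascript
const os = require('node:os');
const path = require('node:path');

// Load native addons in dependency order, asking the dynamic linker to make
// each library's symbols globally visible to libraries loaded afterwards.
// By default this calls Node's process.dlopen(module, filename, flags).
function loadInOrder(
  addonPaths,
  dlopen = (mod, filename, flags) => process.dlopen(mod, filename, flags),
) {
  // os.constants.dlopen is only populated on POSIX platforms
  const { RTLD_NOW = 0, RTLD_GLOBAL = 0 } = os.constants.dlopen ?? {};
  const flags = RTLD_NOW | RTLD_GLOBAL;
  const loaded = [];
  for (const addonPath of addonPaths) {
    // Use an absolute path so each library is always referred to the same way
    const filename = path.resolve(addonPath);
    const mod = { exports: {} };
    dlopen(mod, filename, flags);
    loaded.push({ filename, exports: mod.exports });
  }
  return loaded;
}

// Hypothetical usage: the shared TLS addon first, then the addon that needs it.
// const [tls, quic] = loadInOrder([
//   './build/Release/boringssl.node',
//   './build/Release/quic.node',
// ]);
```

Whether this is enough for the second addon to bind to the first addon's symbols still depends on how the second one was linked, so treat it as something to experiment with rather than a confirmed solution.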

> To create a second native binding which relies on a function defined in the first library, you need to follow these steps:
>
> **Step 1: Expose the headers of the first library to the second library**
>
> Headers from the first library must be included in the second library. You will do this in the second library's C code. You can add `#include` statements in the C files of your second library that point to the header files of your first library.
>
> For example, if you have a function in your first library defined in "first_lib.h" and implemented in "first_lib.c", you would include this header file in your second library's C files as follows:
>
> ```c
> #include "path/to/first_lib.h"
> ```
>
> Here, the path should be relative to the second library's C files or an absolute path.
>
> **Step 2: Link the first library when compiling the second library**
>
> When compiling the second library, you need to ensure that it's linked against the first library. This is typically done using the `-l` option followed by the name of the first library, assuming that it can be found in your system's standard library paths.
>
> If it's not in the standard paths, you will also need to use the `-L` option followed by the directory path of the first library to tell the linker where to find it.
>
> The following is an example of how you might compile the second library:
>
> ```bash
> gcc -o second_lib.o -c second_lib.c
> gcc -shared -o second_lib.so second_lib.o -L/path/to/first/library -lfirst_library
> ```
>
> Here, "first_library" is the name of the first library and "/path/to/first/library" is its directory path. Ensure that the first library's shared object file is in the specified directory and has the appropriate name (e.g., "libfirst_library.so").
>
> **Step 3: Check for potential issues**
>
> Function calls should work as expected when you load the second object into Node.js, assuming that the first library is correctly compiled and linked. However, if the first library is also loaded into Node.js, you may run into symbol conflicts. These issues can often be avoided by using unique names for your C functions and variables.
>
> It's also important to make sure that the first library's shared object file is accessible to the second library at runtime. If the first library's .so file isn't in a standard system library path, you may need to add its location to the `LD_LIBRARY_PATH` environment variable:
>
> ```bash
> export LD_LIBRARY_PATH=/path/to/first/library:$LD_LIBRARY_PATH
> ```
>
> Keep in mind that changes to `LD_LIBRARY_PATH` are only in effect for the current shell session and its child processes. You may want to add this command to your shell's initialization file (like `.bashrc` or `.bash_profile`) if you need it to be in effect across all shell sessions.
>
> Remember to follow the Node.js native addon documentation carefully, and use tools like node-gyp to help manage the complexity of building and linking your C libraries.

> Yes, the dynamic linker keeps track of loaded libraries using their full path names as unique identifiers. When a request is made to load a shared library, the dynamic linker checks if a library with the same path is already loaded into the process's address space.
>
> On Unix-like systems, including Linux, this behavior is part of the dynamic linker's handling of shared libraries. The dynamic linker, often invoked via the `ld.so` or `ld-linux.so` command, is responsible for loading shared libraries into a process at runtime and resolving symbols (function and variable names) between them.
>
> When you use the `dlopen` function to dynamically load a shared library, as Node.js does when you use `require` to load a native module, `dlopen` first asks the dynamic linker to check if the library is already loaded in the current process.
>
> The dynamic linker maintains a list of currently loaded libraries, and each library in this list includes the full path to the library file. If a library with the same path is already loaded, `dlopen` simply returns a handle to the already loaded library and increases its reference count.
>
> This mechanism ensures that, even if different parts of a program try to load the same shared library, only one copy of the library is loaded into memory. This behavior saves memory and allows the shared library to maintain a single shared state across the entire process.

> Yes, when referring to the full path, it typically means the absolute path from the root of the filesystem.
>
> If the same library is located in two different paths, and both are loaded separately, the dynamic linker will treat them as distinct libraries because they have different absolute paths, even though the actual content of the libraries may be identical. As a result, each library would be loaded separately, each with its own distinct set of symbols and state.
>
> So, indeed, there could be potential conflicts or unexpected behavior. For example, if both libraries define a global variable with the same name, there would now be two separate instances of that global variable in the process's memory. Any changes made to the variable in one library wouldn't be reflected in the other library, even though you might expect them to be the same.
>
> In general, it's best to avoid this situation if possible. It's good practice to ensure that each shared library in a system has a unique absolute path. If a library needs to be installed in multiple places for some reason, it's usually best to use symbolic links to point to a single canonical location, rather than having multiple separate copies of the library. This way, regardless of which path is used to load the library, the dynamic linker will resolve it to the same absolute path and load the same instance of the library.

> Your understanding is correct -- if the first library is loaded using `require`, its symbols (functions, variables, etc.) will be available for the second library to use, as long as the second library includes the correct headers and is linked correctly.
>
> Regarding the `-l` flag in the compilation step, it's used to specify libraries against which to link. However, this assumes that you have access to the library file (the `.so` file, for instance) at the time of compiling your second library. This would be straightforward if you're compiling everything on the same machine and the first library is installed somewhere your compiler can access it.
>
> In a Node.js/npm context, things are a bit more complicated. The actual `.so` files might not be available on the machine where you're compiling the second library, particularly if the first library is an npm package that's distributed as precompiled binaries.
>
> There are a couple of ways you might handle this:
>
> 1. **Distribute the first library's headers with the npm package and use dynamic linking**: In this case, you would include the first library's headers in its npm package, and the second library would include these headers at compile time. However, you wouldn't actually link against the first library at compile time. Instead, the second library would use dynamic linking (via `dlopen` or a similar mechanism) to access the first library's symbols at runtime. This means the second library would need to handle the possibility that the first library's symbols might not be available.
>
> 2. **Compile the second library on the end user's machine using node-gyp**: In this case, you would set up your second library's npm package to compile its source code as part of the npm install process. You would specify the first library as a dependency in your package.json file, so npm would install it before compiling the second library. Then, during the compile step, the first library's `.so` file would be available for linking. You can use a binding.gyp file to specify the paths to the necessary libraries and headers.
>
> Both of these methods have their advantages and disadvantages. The right choice depends on your specific use case and constraints. It's also possible there may be another solution that's more appropriate for your specific situation, as this is a somewhat complex and nuanced area of software development.

> That's correct. If you're following the first approach, you won't be using the `-l` flag when you compile the second library. This is because you're not linking against the first library at compile time. Instead, you'll be accessing its symbols dynamically at runtime.
>
> In this scenario, the Node.js code for the second library should `require` the first library before it `require`s the second library. This ensures that the first library's symbols are loaded into memory and available for the second library to use. Here's a simple example:
>
> ```javascript
> const firstLib = require('first-library');
> const secondLib = require('second-library');
> ```
>
> In the C code for the second library, you should include the headers for the first library. This allows the compiler to check that you're using the first library's symbols correctly:
>
> ```c
> #include "first_library_header.h"
> ```
>
> When you compile the second library, you'll need to make sure that the first library's header files are in the compiler's include path. You can use the `-I` flag to add directories to the include path:
>
> ```bash
> gcc -I/path/to/first/library/headers -o second_library.o -c second_library.c
> gcc -shared -o second_library.so second_library.o
> ```
>
> Remember to replace `/path/to/first/library/headers` with the actual path to the first library's header files.
>
> This approach should work as long as the first library's shared object file is loaded into memory before the second library's shared object file. You're essentially using Node.js and npm to manage the distribution and loading of the shared libraries, and relying on the dynamic linker to resolve symbols at runtime. This is a common pattern in plugin architectures, where plugins need to use symbols defined by the host application, but the host application doesn't know in advance what plugins will be loaded.

You said earlier that require would end up calling dlopen, which would check if a path to a shared object has already been loaded before and thus avoid double loading the same library. Here's a problem: if the shared object (dynamically linked library) is distributed via npm, it would get installed under node_modules. The project itself may exist at an arbitrary path, so the full absolute path to node_modules would differ depending on the project, and therefore the full absolute path to the shared object would also differ each time. Because node_modules is technically flattened, as long as all dependencies load the same npm package version, all these paths would be the same. But it's possible that some dependencies may have a different package version, so a different shared object might be loaded. Furthermore, during dynamic linking, do we have to ensure that the -I path is the absolute path? How does loading the second library find the symbols of the first library?

Your observations are correct. In the context of Node.js and npm, the path to a native addon (or any package) can indeed vary between installations, as packages are installed under node_modules, and the project's location can be different on different systems. As such, different versions of the same library may get loaded due to version discrepancies in dependencies, which can lead to symbol conflicts and other issues.

Here's how the mechanism would work in general:

1. **Loading the Libraries**: When your Node.js script calls require on the first and then the second library, these libraries are loaded into memory. Node.js uses the dlopen function (or equivalent, depending on the platform) to do this. As you correctly pointed out, dlopen uses the absolute path to the library as the identifier to ensure that the same library isn't loaded multiple times. However, in this case, "the same library" means a library at the same path, not a library with the same content or the same version. This is why it's possible to end up with multiple versions of the same library loaded at the same time if they're located at different paths.

2. **Including Headers**: When you compile the second library, you include the first library's headers using the -I flag to specify the path to these headers. This is necessary for the compiler to verify that the second library's code is calling the first library's functions correctly. However, it doesn't actually affect the runtime behavior of the code. The compiler doesn't need to know where the first library's shared object file will be at runtime.

3. **Resolving Symbols**: When the second library is loaded, it needs to resolve the symbols (functions, variables, etc.) that it references from the first library. The dynamic linker handles this by searching through the libraries that have already been loaded into memory (by previous calls to dlopen) to find these symbols. It doesn't matter where on disk these libraries came from; what matters is that they're already loaded into memory.

As for handling different versions of the same library, it is indeed a complex issue. It's always best to ensure that all parts of your application are using the same version of any shared libraries to avoid inconsistencies and potential conflicts. npm's dependency resolution algorithm generally tries to install the latest version that satisfies all dependencies, but this can sometimes lead to multiple versions being installed if different parts of your application require incompatible versions. If you have control over all parts of your application, it would be best to ensure that they all depend on the same version of the shared library.

In general, managing dependencies, particularly native dependencies, can be one of the more complex aspects of software development. It requires careful design and planning to ensure that all parts of your application are compatible and can work together correctly.
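To get a feel for whether the multiple-versions problem actually occurs in a given project, something like the following could be run against the project root. This is only a sketch: `findInstalledCopies` is a hypothetical helper, it assumes the conventional nested `node_modules` layout, and it has not been tried against pnpm or workspace setups.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Recursively collect every installed copy of `pkgName` under `root`.
// Returns [{ version, dir }] where dir is the canonical (symlink-resolved)
// path, since the on-disk location is what decides which shared object loads.
function findInstalledCopies(root, pkgName) {
  const found = [];
  const seen = new Set(); // canonical directories already visited
  const visit = (dir) => {
    const real = fs.realpathSync(dir);
    if (seen.has(real)) return; // guards against symlink cycles
    seen.add(real);
    const nm = path.join(real, 'node_modules');
    if (!fs.existsSync(nm)) return;
    for (const entry of fs.readdirSync(nm)) {
      if (entry.startsWith('.')) continue; // skip .bin, .package-lock.json, etc.
      // Scoped packages live one level deeper: node_modules/@scope/name
      const pkgDirs = entry.startsWith('@')
        ? fs.readdirSync(path.join(nm, entry)).map((n) => path.join(nm, entry, n))
        : [path.join(nm, entry)];
      for (const pkgDir of pkgDirs) {
        const manifest = path.join(pkgDir, 'package.json');
        if (!fs.existsSync(manifest)) continue;
        const { name, version } = JSON.parse(fs.readFileSync(manifest, 'utf8'));
        if (name === pkgName) {
          found.push({ version, dir: fs.realpathSync(pkgDir) });
        }
        visit(pkgDir); // descend into this package's own node_modules
      }
    }
  };
  visit(root);
  return found;
}
```

If this returns more than one entry for the package that ships the shared object, then more than one copy of that shared object could end up loaded into the same process.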

What exactly happens if 2 native libraries were to be loaded into memory, and both exposed the same function name (symbol)?

If two native libraries are loaded into memory and they both expose the same function name, what happens depends on how these libraries are loaded and how the function is called.

If you load the two libraries in a Node.js application using require, each library will be loaded into its own isolated context. When you call a function from one of these libraries, Node.js will call the correct version of the function based on which require'd object you're using to access the function. This is because Node.js keeps track of the bindings between JavaScript objects and native libraries, so there's no confusion about which library a function call should go to.

This is a key benefit of using require to load native libraries in Node.js -- it isolates libraries from each other, so there's less risk of conflicts between them. It's similar to how JavaScript modules each have their own scope, and variables defined in one module don't interfere with variables in another module.

However, if you're dealing with libraries that are loaded and called directly from C or C++ code, it's possible to have conflicts if different libraries define a function with the same name. The behavior in this case depends on the specifics of the dynamic linker and how it's configured.

By default, when a program or library calls a function from a shared library, the dynamic linker searches through the loaded libraries in the order they were loaded, and it uses the first matching function it finds. So, if two libraries define a function with the same name, the version of the function in the library that was loaded first will be called.

Some systems also provide ways to control symbol resolution more precisely. For example, on Linux, you can use the RTLD_LOCAL flag with dlopen to specify that a library's symbols should not be made available to subsequently loaded libraries. Conversely, the RTLD_GLOBAL flag specifies that a library's symbols should be made available to subsequently loaded libraries. You can also use version scripts and other linker options to control symbol visibility and resolution.

In general, though, it's best to avoid having multiple libraries that define the same symbols, if possible. This can cause confusion and lead to bugs that are difficult to diagnose. If you're writing a library that's intended to be used as a shared library, it's a good practice to use unique names for your exported functions (such as by including your library's name as a prefix) to reduce the risk of conflicts.
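The lookup rule described above can be illustrated with a toy model. To be clear, this is not the real dynamic linker and `ToyLinker` is a made-up name. It only models the "search loaded libraries in load order, first match wins" rule, with a `global` option standing in for `RTLD_GLOBAL` versus `RTLD_LOCAL`.

```javascript
// Toy model of dlopen-style symbol lookup. It only illustrates that the
// first globally visible definition, in load order, wins.
class ToyLinker {
  constructor() {
    this.libs = []; // kept in load order
  }

  // symbols: { symbolName: implementation }
  // global: true models RTLD_GLOBAL, false models RTLD_LOCAL
  load(name, symbols, { global = true } = {}) {
    const lib = { name, symbols, global };
    this.libs.push(lib);
    return lib;
  }

  // Resolve an undefined symbol on behalf of a later-loaded library by
  // searching globally visible libraries in the order they were loaded.
  resolve(symbol) {
    for (const lib of this.libs) {
      const defined = Object.prototype.hasOwnProperty.call(lib.symbols, symbol);
      if (lib.global && defined) {
        return { from: lib.name, fn: lib.symbols[symbol] };
      }
    }
    return null;
  }
}

// Two versions of the same library both export `tls_version`
const linker = new ToyLinker();
linker.load('libtls@1', { tls_version: () => '1.0' });
linker.load('libtls@2', { tls_version: () => '2.0' });
const hit = linker.resolve('tls_version');
console.log(hit.from, hit.fn()); // the first-loaded definition is chosen
```

In this model, loading `libtls@1` with `{ global: false }` makes the lookup fall through to `libtls@2`, which is the isolation case. Which of the two behaviours a real Node process shows depends on the flags the addons were actually opened with.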

So basically, suppose I had dependencies relying on different versions of the same npm package, and that npm package exposed a native shared object that was loaded into memory. If a third dependency with its own native object were to call a C function that was exposed by different versions of the same npm package, then the resolution would pick the C function exposed by whichever shared object was loaded first.

Yes, your understanding is correct. If you had multiple versions of the same shared library loaded into memory (each one from a different npm package) and a third native module tried to use a function that is defined in that shared library, the dynamic linker would resolve the function call to the version of the function in the library that was loaded first. This behavior is a result of how the dynamic linker handles symbol resolution.

The order in which libraries are loaded can be important in this situation. When Node.js loads native addons using require, it loads them in the order they are required in the JavaScript code. Therefore, if your JavaScript code first requires the npm package that includes the first version of the shared library, and then requires the npm package that includes the second version of the shared library, the first version of the shared library will be loaded into memory first, and its symbols will be used for resolution.

As a result, if the two versions of the shared library have different implementations of the same function, the version from the first-loaded library will be used, which might not be what you expect. This could potentially lead to subtle bugs if the different versions of the function have different behaviors.

It's generally best to ensure that all parts of your application are using the same version of any shared libraries to avoid this kind of issue. If that's not possible, you'll need to carefully manage the order in which libraries are loaded and be aware of how this affects symbol resolution.

Some notes on standardising the API for custom TLS verification: https://github.com/MatrixAI/Polykey/issues/551#issuecomment-1730732555

Linear marked this as stale so I'm reopening.
