MatrixAI / Polykey

Polykey Core Library
https://polykey.com
GNU General Public License v3.0

CSR feature for Keys domain to allow KN root certificates to be signed by an external CA #154

Open joshuakarp opened 3 years ago

joshuakarp commented 3 years ago

Created by @CMCDragonkai

Enabling KN root certs to be trusted by an external CA allows PK KNs to be integrated into an existing PKI. Whether that's a public or private PKI, it increases our compatibility with existing infrastructure.

This would be an interactive thing, as PK has to generate a CSR. However, our certificates don't have any kind of common name tied to domains or anything, so I'm not sure whether internet CAs will be of any use here.

This requires some research.
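
For reference, a minimal sketch of what CSR generation could look like with node-forge (the library the old PKI code below uses). The subject attributes and the use of an RSA keypair here are illustrative assumptions, not how PK's root key is actually structured:

```ts
import forge from 'node-forge';

// Generate a keypair for the node (RSA here only because node-forge's CSR
// signing supports it directly; PK's actual root key differs).
const keys = forge.pki.rsa.generateKeyPair(2048);

// Build the CSR: bind the public key to a subject and self-sign it so the
// CA can verify that the requester holds the private key.
const csr = forge.pki.createCertificationRequest();
csr.publicKey = keys.publicKey;
csr.setSubject([{ name: 'commonName', value: 'pk-node.example.com' }]); // hypothetical CN
csr.sign(keys.privateKey, forge.md.sha256.create());

// The PEM form is what would be submitted to the external CA.
const csrPem = forge.pki.certificationRequestToPem(csr);
```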

Additional Context

Regarding: https://github.com/MatrixAI/Polykey-Design/issues/14

A "Public Key Infrastructure" PKI is fundamentally a centralised key server for handing out asymmetric keys/certificates representing identity in organisations or some trusted context. The DOD is one of the largest users of such a thing.

A lot of existing "user management" systems are corporate LDAP systems, and much software ultimately integrates with LDAP for centralised user authentication. Consider that many single sign-on systems, such as SAML-based ones, integrate with LDAP: https://auth0.com/blog/how-saml-authentication-works/

Polykey is not an enterprise authentication or identity system, and that's fine. However, we do envision Polykey being used by enterprises, where secrets are often associated with "people", but also with machines. That's what we want Polykey to serve: the people that have associated identities in enterprise identity systems, and also the machines that need secrets distributed to them, including TLS certificates and the like.

Polykey, unlike Vault, sits on both sides: the client side and the server side, and any server acts as a client as well. That's the really cool part, in that it also supports complex integration on the end-user side, allowing push/pull logic and flexibility in where that logic initiates.

That leads to my other idea: each Polykey agent can be a certificate authority. Like a web of trust system, or a hierarchical X.509 system, each agent is able to sign other Polykey agents, and this hierarchy is what allows Polykeys to push secrets to other Polykeys.
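
A hedged sketch of this agent-as-CA idea using node-forge: the CA agent verifies an incoming CSR and issues a certificate chained to its own root cert. The function name, serial scheme, and validity window are all illustrative, not existing PK APIs:

```ts
import forge from 'node-forge';

// Hypothetical helper: caCert/caKey are the signing agent's CA root cert and
// private key; csrPem is the PEM CSR received from another agent.
function signAgentCsr(
  csrPem: string,
  caCert: forge.pki.Certificate,
  caKey: forge.pki.rsa.PrivateKey,
): string {
  const csr = forge.pki.certificationRequestFromPem(csrPem);
  // Check the CSR's self-signature to prove possession of the private key.
  if (!csr.verify()) throw new Error('CSR signature invalid');
  const cert = forge.pki.createCertificate();
  cert.publicKey = csr.publicKey!;
  cert.serialNumber = Date.now().toString(16); // illustrative serial only
  cert.validity.notBefore = new Date();
  cert.validity.notAfter = new Date(Date.now() + 365 * 24 * 60 * 60 * 1000);
  cert.setSubject(csr.subject.attributes);
  cert.setIssuer(caCert.subject.attributes); // chains the new cert under the CA
  cert.sign(caKey, forge.md.sha256.create());
  return forge.pki.certificateToPem(cert);
}
```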

An "identity server" is thus one representation of human identities. However, asymmetric cryptosystems provide significant advantages at the trade-off of some increased technical complexity. PK/PKE can therefore offer a way to bridge the gap between traditional user identity systems and asymmetric cryptosystems: a sort of "hybrid identity system" that is both centralised and decentralised at the same time.

CMCDragonkai commented 3 years ago

This may require old PKI code. I'm adding the old code for some PKI testing here in case it helps:

PKI.test.ts

```ts
/* eslint-disable */
import fs from 'fs';
import os from 'os';
import net from 'net';
import http from 'https'; // note: 'http' here is actually the https module
import forge from 'node-forge';
import * as grpc from '@grpc/grpc-js';
import { randomString } from '../../../src/utils';
import { KeyManager } from '../../../src/Polykey';
import { NodeMessage, SubServiceType } from '@/proto/js/Node_pb';
import { NodeClient, NodeService } from '@/proto/js/Node_grpc_pb';
import { TLSCredentials } from '../../../src/nodes/pki/PublicKeyInfrastructure';

// TODO: part of adding PKI functionality to polykey
describe('PKI testing', () => {
  let tempDirNodeCA: string;
  let kmCA: KeyManager;
  let tempDirNodeA: string;
  let kmA: KeyManager;
  let tempDirNodeB: string;
  let kmB: KeyManager;
  beforeAll(async () => {
    // ======== CA PEER ======== //
    // Define temp directory
    tempDirNodeCA = fs.mkdtempSync(`${os.tmpdir()}/pktest${randomString(5)}`);
    // Create pki
    kmCA = new KeyManager(tempDirNodeCA, fs);
    await kmCA.generateKeyPair('kmCA', 'passphrase');

    // ======== PEER A ======== //
    // Define temp directory
    tempDirNodeA = fs.mkdtempSync(`${os.tmpdir()}/pktest${randomString(5)}`);
    // Create pki
    kmA = new KeyManager(tempDirNodeA, fs);
    await kmA.generateKeyPair('kmA', 'passphrase');
    kmA.pki.addCA(kmCA.pki.RootCert);

    // ======== PEER B ======== //
    // Define temp directory
    tempDirNodeB = fs.mkdtempSync(`${os.tmpdir()}/pktest${randomString(5)}`);
    // Create pki
    kmB = new KeyManager(tempDirNodeB, fs);
    await kmB.generateKeyPair('kmB', 'passphrase');
    kmB.pki.addCA(kmCA.pki.RootCert);
  });
  afterAll(() => {
    fs.rmdirSync(tempDirNodeCA, { recursive: true });
    fs.rmdirSync(tempDirNodeA, { recursive: true });
    fs.rmdirSync(tempDirNodeB, { recursive: true });
  });
  test('can request a certificate from a ca node', () => {
    const csr = kmA.pki.createCSR('localhost', 'passphrase');
    const certificate = kmCA.pki.handleCSR(csr);
    expect(certificate).not.toEqual(undefined);
  });
  describe('Transport Layer Security', () => {
    let tlsServerCredentials: TLSCredentials;
    let tlsClientCredentials: TLSCredentials;
    beforeAll(() => {
      // request certificates from CA for both kmA.pki and kmB.pki
      // ==== PEER A ==== //
      const csrA = kmA.pki.createCSR('localhost', 'passphrase');
      kmA.pki.importCertificate(kmCA.pki.handleCSR(csrA));
      // ==== PEER B ==== //
      const csrB = kmB.pki.createCSR('localhost', 'passphrase');
      kmB.pki.importCertificate(kmCA.pki.handleCSR(csrB));
      // kmA.pki will provide the server credentials and kmB.pki will provide the client credentials
      tlsServerCredentials = kmA.pki.TLSServerCredentials!;
      tlsClientCredentials = kmB.pki.TLSClientCredentials!;
    });
    test('can use certificates to create an mtls connection', (done) => {
      // set up the mock server
      const randomSecureMessage = `random-secure-message: ${randomString(5)}\n`;
      const server = http
        .createServer(
          {
            key: tlsServerCredentials!.keypair.private,
            cert: tlsServerCredentials!.certificate,
            ca: [tlsServerCredentials!.rootCertificate],
            // requestCert: true,
          },
          (req, res) => {
            res.writeHead(200);
            res.end(randomSecureMessage);
          },
        )
        .listen(0, 'localhost', () => {
          const serverAddress = server.address() as net.AddressInfo;
          const req = http.request(
            {
              host: 'localhost',
              port: serverAddress.port,
              path: '/',
              method: 'GET',
              key: tlsClientCredentials!.keypair.private,
              cert: tlsClientCredentials!.certificate,
              ca: [tlsClientCredentials!.rootCertificate],
            },
            (res) => {
              res.on('data', (d) => {
                expect(d.toString()).toEqual(randomSecureMessage);
                done();
              });
            },
          );
          req.on('error', (e) => {
            expect(e).toBeUndefined();
            done();
          });
          req.end();
        });
    });
  });
  describe('gRPC TLS', () => {
    let tlsServerCredentials: TLSCredentials;
    let tlsClientCredentials: TLSCredentials;
    beforeAll(() => {
      // request certificates from CA for both kmA.pki and kmB.pki
      // ==== PEER A ==== //
      const csrA = kmA.pki.createCSR('localhost', 'passphrase');
      kmA.pki.importCertificate(kmCA.pki.handleCSR(csrA));
      // ==== PEER B ==== //
      const csrB = kmB.pki.createCSR('localhost', 'passphrase');
      kmB.pki.importCertificate(kmCA.pki.handleCSR(csrB));
      // kmA.pki will provide the server credentials and kmB.pki will provide the client credentials
      tlsServerCredentials = kmA.pki.TLSServerCredentials!;
      tlsClientCredentials = kmB.pki.TLSClientCredentials!;
    });
    test('can create a gRPC server and client', (done) => {
      const server = new grpc.Server();
      server.addService(NodeService, {
        messageNode: async (call, callback) => {
          const nodeRequest: NodeMessage = call.request;
          // echo server
          callback(null, nodeRequest);
        },
      });
      const serverCredentials = grpc.ServerCredentials.createSsl(
        Buffer.from(tlsServerCredentials.rootCertificate),
        [
          {
            private_key: Buffer.from(tlsServerCredentials.keypair.private),
            cert_chain: Buffer.from(tlsServerCredentials.certificate),
          },
        ],
        true,
      );
      const clientCredentials = grpc.ChannelCredentials.createSsl(
        Buffer.from(tlsClientCredentials.rootCertificate),
        Buffer.from(tlsClientCredentials.keypair.private),
        Buffer.from(tlsClientCredentials.certificate),
      );
      server.bindAsync(`localhost:0`, serverCredentials, async (err, boundPort) => {
        if (err) {
          throw err;
        } else {
          server.start();
          const nodeClient = new NodeClient(`localhost:${boundPort}`, clientCredentials);
          const nodeRequest = new NodeMessage();
          nodeRequest.setPublicKey('some pub key');
          nodeRequest.setSubMessage('sub message');
          nodeRequest.setType(SubServiceType.GIT);
          nodeClient.messageNode(nodeRequest, (err, response) => {
            if (err) {
              expect(err).toEqual(undefined);
            } else {
              expect(response).toEqual(nodeRequest);
            }
            done();
          });
        }
      });
    });
  });
  describe('Node Forge TLS', () => {
    let tlsServerCredentials: TLSCredentials;
    let tlsClientCredentials: TLSCredentials;
    beforeAll(() => {
      // request certificates from CA for both kmA.pki and kmB.pki
      // ==== PEER A ==== //
      const csrA = kmA.pki.createCSR('server', 'passphrase');
      kmA.pki.importCertificate(kmCA.pki.handleCSR(csrA));
      // ==== PEER B ==== //
      const csrB = kmB.pki.createCSR('client', 'passphrase');
      kmB.pki.importCertificate(kmCA.pki.handleCSR(csrB));
      // kmA.pki will provide the server credentials and kmB.pki will provide the client credentials
      tlsServerCredentials = kmA.pki.TLSServerCredentials!;
      tlsClientCredentials = kmB.pki.TLSClientCredentials!;
    });
    test('node forge tls works with custom certificates', (done) => {
      const end: any = {};
      let success = false;
      // create TLS client
      end.client = forge.tls.createConnection({
        server: false,
        caStore: [forge.pki.certificateFromPem(tlsServerCredentials.certificate)],
        sessionCache: {},
        // supported cipher suites in order of preference
        cipherSuites: [
          forge.tls.CipherSuites.TLS_RSA_WITH_AES_128_CBC_SHA,
          forge.tls.CipherSuites.TLS_RSA_WITH_AES_256_CBC_SHA,
        ],
        virtualHost: 'server',
        verify: function (c, verified, depth, certs) {
          console.log(
            'TLS Client verifying certificate w/CN: "' +
              certs[0].subject.getField('CN').value +
              '", verified: ' +
              verified +
              '...',
          );
          return verified;
        },
        connected: function (c) {
          console.log('Client connected...');
          // send message to server
          setTimeout(function () {
            c.prepareHeartbeatRequest('heartbeat');
            c.prepare('Hello Server');
          }, 1);
        },
        getCertificate: function (c, hint) {
          console.log('Client getting certificate ...');
          return tlsClientCredentials.certificate;
        },
        getPrivateKey: function (c, cert) {
          return tlsClientCredentials.keypair.private;
        },
        tlsDataReady: function (c) {
          // send TLS data to server
          end.server.process(c.tlsData.getBytes());
        },
        dataReady: function (c) {
          const response = c.data.getBytes();
          console.log('Client received "' + response + '"');
          success = response === 'Hello Client';
          expect(success).toEqual(true);
          c.close();
        },
        heartbeatReceived: function (c, payload) {
          console.log('Client received heartbeat: ' + payload.getBytes());
        },
        closed: function (c) {
          expect(success).toEqual(true);
          done();
        },
        error: function (c, error) {
          console.log('Client error: ' + error.message);
        },
      });
      // create TLS server
      end.server = forge.tls.createConnection({
        server: true,
        caStore: [forge.pki.certificateFromPem(tlsClientCredentials.certificate)],
        sessionCache: {},
        // supported cipher suites in order of preference
        cipherSuites: [
          forge.tls.CipherSuites.TLS_RSA_WITH_AES_128_CBC_SHA,
          forge.tls.CipherSuites.TLS_RSA_WITH_AES_256_CBC_SHA,
        ],
        connected: function (c) {
          console.log('Server connected');
          c.prepareHeartbeatRequest('heartbeat');
        },
        verifyClient: true,
        verify: function (c, verified, depth, certs) {
          console.log(
            'Server verifying certificate w/CN: "' +
              certs[0].subject.getField('CN').value +
              '", verified: ' +
              verified +
              '...',
          );
          return verified;
        },
        getCertificate: function (c, hint) {
          console.log('Server getting certificate for "' + hint[0] + '"...');
          return tlsServerCredentials.certificate;
        },
        getPrivateKey: function (c, cert) {
          return tlsServerCredentials.keypair.private;
        },
        tlsDataReady: function (c) {
          // send TLS data to client
          end.client.process(c.tlsData.getBytes());
        },
        dataReady: function (c) {
          console.log('Server received "' + c.data.getBytes() + '"');
          // send response
          c.prepare('Hello Client');
          c.close();
        },
        heartbeatReceived: function (c, payload) {
          console.log('Server received heartbeat: ' + payload.getBytes());
        },
        closed: function (c) {
          console.log('Server disconnected.');
        },
        error: function (c, error) {
          console.log('Server error: ' + error.message);
        },
      });
      console.log('created TLS client and server, doing handshake...');
      end.client.handshake();
    });
  });
});
```
CMCDragonkai commented 1 year ago

The ACME protocol (https://datatracker.ietf.org/doc/html/rfc8555) would be most suitable for this.

The ACME protocol could also help with "cluster membership" #403.
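
A rough sketch of what an RFC 8555 flow could look like from Node, using the acme-client package (not a current PK dependency). The challenge handlers are stubs, and every name here is an assumption:

```ts
import * as acme from 'acme-client';

// Hypothetical: obtain a certificate for a node via ACME against a staging CA.
async function acmeIssue(): Promise<string> {
  const client = new acme.Client({
    directoryUrl: acme.directory.letsencrypt.staging, // staging CA for testing
    accountKey: await acme.crypto.createPrivateKey(),
  });
  // CSR for the node's public name; PK nodes lack a stable DNS name today,
  // which is exactly the open question raised in this issue.
  const [, csr] = await acme.crypto.createCsr({ commonName: 'pk-node.example.com' });
  return client.auto({
    csr,
    email: 'admin@example.com',
    termsOfServiceAgreed: true,
    challengePriority: ['http-01'],
    challengeCreateFn: async (authz, challenge, keyAuthorization) => {
      // stub: publish keyAuthorization at /.well-known/acme-challenge/<token>
    },
    challengeRemoveFn: async () => {
      // stub: clean up the challenge response
    },
  });
}
```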

CMCDragonkai commented 1 year ago

This feature might actually be useful for bootstrapping into a PKE portal.

I noticed that in tailscale, if you want to allow automated agents to bootstrap into the tailnet, you have to pre-generate a shared secret among its auth keys. These pre-generated keys can carry conditions like one-time use, expiry, and so on.

Basically, you end up setting it as an env variable when you boot up the tailscale node, which then generates device keys (once). The device keys can have their own expiry, independent of the "auth key" provided by tailscale.

This is very similar to PK's bootstrapping concept. Instead of device keys, we have the PK agent's root key, an Ed25519 public/private keypair; these are basically the "device keys" in tailscale terms.

Now, do we expect users to go into their PKE portal to generate an auth key ahead of time? This contrasts with an "interactive setup", where the user of the tailscale node is prompted to go into the browser (or some other mechanism) and complete an authentication loop. In tailscale this also requires "approval" by an administrative user, though that can be optional, since auth keys can be pre-approved.

In tailscale there are multiple levels/layers of permission groups. It's actually a bit all over the place rather than a single elegant abstraction. For example: there is the "owner/creator" that corresponds to the node, i.e. the identity that logs into tailscale; there is "approval"; there is enabling/disabling key expiry and enabling/disabling certain capabilities (like being an exit node); and there is tagging, where tagging a node replaces any permissions granted by the owner/creator. You can also add multiple tags, which creates overlapping permissions that could be additive, subtractive, or even conflicting (apparently resolved using a first-match rule).

This situation creates a lot of confusion and results in security-configuration drift (as we've talked about a lot in the Matrix OS context).

Anyway, going through tailscale's exit node deployment (https://github.com/patte/fly-tailscale-exit) demonstrates that there are fundamentally two bootstrapping techniques:

  1. Eager/ahead-of-time - can be done "headless": a pre-generated, pre-shared secret key, with various conditions like one-time use, etc. (see the sketch after this list)
  2. Lazy/just-in-time - requires some sort of interactive prompt: could be a browser loop, could be OAuth2, a device code flow, etc.
  3. Is there a third way? Like a zero-knowledge proof?
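
As a concrete illustration of technique 1, a minimal sketch of "eager" headless bootstrapping. `PK_AUTH_KEY` and `enrolWithPortal` are hypothetical names for illustration, not existing PK APIs:

```ts
import { generateKeyPairSync, KeyObject } from 'crypto';

// Hypothetical portal call: presents the pre-shared auth key together with the
// node's freshly generated public key; the portal enforces the auth key's
// conditions (one-time use, expiry, ...) before admitting the node.
async function enrolWithPortal(authKey: string, publicKey: KeyObject): Promise<void> {
  // stub: would POST to the PKE portal's enrolment endpoint
}

async function headlessBootstrap(): Promise<void> {
  const authKey = process.env.PK_AUTH_KEY; // hypothetical env var, minted ahead of time
  if (authKey == null) {
    throw new Error('headless bootstrap requires PK_AUTH_KEY');
  }
  // The long-lived "device" keys (PK's Ed25519 root keypair) are generated
  // locally, once, with an expiry independent of the auth key.
  const { publicKey } = generateKeyPairSync('ed25519');
  await enrolWithPortal(authKey, publicKey);
}
```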

Now, we do have something quite interesting in our case: we can make use of CSRs, especially given that we expect key rotation and key migration to occur. It would make sense that we can continue to authenticate nodes to the PKE portal as long as the node's key exists in a hierarchy that is signed/trusted by the certificate authority; each PKE portal would have its own certificate authority, possibly provided by the first seed node.

The idea of relying on CSRs and a certificate hierarchy is that we end up with two kinds of "trust chains":

  1. The trust chain of the X.509 certificate hierarchy - this is less flexible and relies on signatures on the certificates themselves
  2. The sigchain system (representing an abstraction of property certificates) - this creates graphs/gestalts, and is also more flexible

In the first case, each certificate and their information represents identity.

In the second case, each claim is a block on a blockchain, and they can represent arbitrary information.

The second case is fundamentally more flexible: it is our own creation, and we can build far richer gestalt graphs with it.
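
To make the comparison concrete, a hedged sketch of what a sigchain claim might look like as data, next to an X.509 link. The field names are illustrative, not PK's actual claim schema:

```ts
// An X.509 link binds identity via a signature over a rigid certificate
// structure; a sigchain claim is just a signed block whose payload is
// arbitrary, hash-linked to the previous claim.
interface SigchainClaim {
  payload: {
    iss: string;        // NodeId of the claiming node (illustrative field)
    sub: string;        // what is being claimed (a node, an identity, ...)
    data: unknown;      // arbitrary claim content; this is the flexibility
  };
  hPrev: string | null; // hash of the previous claim; null for the genesis claim
  seq: number;          // position in the chain
  signature: string;    // signed by the node's root key
}
```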

Both cases allow you to create discrete and overlapping "trust networks" (that can be encoded computationally).

One downside of the first case is that it is restricted to the X.509 spec, which is very old, weird, and clunky. In fact, we have already extended beyond the X.509 spec by using additional properties that store signatures; consider that all our certs are self-signed, but also signed by the "parent cert".

One advantage is that X.509 certificates are well understood and can be used to bridge into regular non-PK things like browsers (even this isn't perfect, because browsers don't understand Ed25519 certificates at the moment). Furthermore, we still need to be able to support multi-certs in our networking systems like QUIC and WebSockets.

There is some confluence between the sigchain and X.509 trust networks, and if we can find the right abstraction for both, we could have something quite nifty.

CMCDragonkai commented 11 months ago

This will be needed for two things:

  1. The movement to using SRV records in #599 allows dynamic seed nodes (see the SRV lookup sketch after this list).
  2. The ability to create private networks in PKE.
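
For item 1, a minimal sketch of SRV-based seed discovery with Node's resolver; the `_polykey._udp` record name is an illustrative assumption, not the scheme settled in #599:

```ts
import { promises as dns } from 'dns';

// Resolve seed node candidates dynamically instead of hardcoding them.
async function discoverSeedNodes(domain: string): Promise<Array<string>> {
  const records = await dns.resolveSrv(`_polykey._udp.${domain}`);
  // SRV records carry priority/weight; sort so preferred seeds come first.
  records.sort((a, b) => a.priority - b.priority || b.weight - a.weight);
  return records.map((r) => `${r.name}:${r.port}`);
}
```
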
CMCDragonkai commented 1 month ago

@tegefaulkes this is an old issue regarding CSRs from PK nodes to external CAs. However, this may not work well due to a conflict between what we use the signature field to mean and what external CAs mean by their signature field. Furthermore, it is important to review whether we want PKE's seed nodes (as PKI functionality, CVP 2./CVP 3.) to form internal CAs or external CAs. Unlike regular web-based external CAs, PK's external-facing "trust" isn't just about hostname/DNS ownership, but can be far more generic.

You want to make sure to review old issues and attach them to the relevant project graph/tree.