caddyserver / certmagic

Automatic HTTPS for any Go program: fully-managed TLS certificate issuance and renewal
https://pkg.go.dev/github.com/caddyserver/certmagic?tab=doc
Apache License 2.0
5.05k stars 293 forks source link

Add interface for client-side encryption #158

Open chaudyg opened 2 years ago

chaudyg commented 2 years ago

What would you like to have changed?

Provide a mechanism for client-side encryption.

Why is this feature a useful, necessary, and/or important addition to this project?

Several storage implementations already support some form of client-side encryption, but each using a slightly different encryption mechanism:

Adding a mechanism for client-side envelope encryption to cert-magic would :

Here are some initial thoughts:

    Encrypt func(ctx context.Context, data []byte) ([]byte, error)
    Decrypt func(ctx context.Context, data []byte) ([]byte, error)

This would delegate the responsibility to the caller to provide the encryption mechanisms.

I would love to get the community's opinion this before doing a first pass implementation.

francislavoie commented 2 years ago

FWIW the current config structure for storage in Caddy wouldn't allow for this, because there's no sibling config to the storage modules that would allow the caller (Caddy/CertMagic) to have config about how it calls the storage module. Like, the config is like this:

"storage": {
    "module": "file_system",
    ...
}

But to support this, it would need to be in a structure like this (imaginary):

"storage_config": {
    "encrypt_key": "foobar",
    "storage": {
        "module": "file_system",
        ...
    }
}

So it would be a breaking change from a config standpoint, not only an implementation breakage (like the context thing is)

mholt commented 2 years ago

Thanks for proposing this, let's discuss:

Improve security: Client-side encryption adds another layer of security on top of encryption that might happen server-side (for example with s3 or gcs). Encryption would occur before data is sent to the Storage backend. A potential attacker would need to compromise both the storage backend and acquire the encryption key to be able to use the data.

Encryption already (usually?) occurs before data is sent to (remote) storage backends: that's what TLS is for.

Encrypting data at rest, however, is tricky for server applications. What are the specific threat models you are trying to defend against, and what is your use case for said threat models? Usually encrypting databases is a sign you don't trust the system or service your database is stored on, and that's a bigger problem than what encryption can solve.

chaudyg commented 2 years ago

Thanks @mholt and @francislavoie for your answers ☺️ !

Encryption already (usually?) occurs before data is sent to (remote) storage backends: that's what TLS is for.

First, sorry if this bit was confusing. Indeed I am only considering Encryption at Rest here and not Encryption in Transit. Some provides such as s3/gcs offer server-side encryption (the data is encrypted when reaching the storage backend), but here I am only focusing on client-side-encryption (the data is encrypted before being sent to the backend).

Encrypting data at rest, however, is tricky for server applications. What are the specific threat models you are trying to defend against, and what is your use case for said threat models? Usually encrypting databases is a sign you don't trust the system or service your database is stored on, and that's a bigger problem than what encryption can solve.

Certificates is very sensitive data, and it is critical to keep it safe from eavesdropping or tempering. Restricting least-privilege access to storage is essential to prevent exposing the raw data. But what if the access control mechanism fails? then a potential attacker would have access to the raw data in storage.

I don't think this is about "trusting the system" but more about providing multiple layers of protection as part of an defence-in-depth strategy. Encryption can provide an additional layer of protection and mitigate a weakness of the primary access control mechanism. Several cert-magic storage backend already seem to provide a way to encrypt the data, but the choice of encryption algorithm is set in stone which isn't great if you need stronger encryption (or for compliance).

So it would be a breaking change from a config standpoint, not only an implementation breakage

yeah, i figured that part one might be tricky πŸ€”. I haven't dig much into how modules are managed yet so sorry for my ignorance here. Indeed we would need an optional "encryption" config section. Would that necessarily be a breaking change? What prevents us to setup encryption directly under the "storage" section?

francislavoie commented 2 years ago

What prevents us to setup encryption directly under the "storage" section?

That config is controlled by the storage implementations themselves (unpacked into the struct from the storage module), there's nowhere for config that is controlled by Caddy/CertMagic, without breaking changes.

Personally I don't see the problem letting storage implementations manage encryption themselves. We could ask them to conform to some standard approach, like we could set up a encryption module namespace that storage modules could use, not sure the complexity is worth it though. Config for that would look like:

"storage": {
    "module": "file_system",
    // options for `file_system`...

    // optional encryption (omit the "encryption" field for none)
    "encryption": {
        "module": "aes",
        "key": "foobar"
    }
}

We'd have modules like encryption.aes or encryption.aws_kms or whatever, essentially modules under the encryption.* namespace.

Point is that Caddy/CertMagic wouldn't invoke the encryption, libs like caddy-tlsredis would accept an encryption module as config and use it if available.

Also, the above would effectively turn this into a Caddy issue, instead of a CertMagic issue, to be clear, because CertMagic wouldn't care, it would be only a concern for Caddy at the config level (and Caddy would define the encryption.* interface I guess) and the storage modules which offer a Caddy compatible module would accept in their config.

Relevant reading to understand how modules work: https://caddyserver.com/docs/architecture, and the basics of plugins: https://caddyserver.com/docs/extending-caddy

chaudyg commented 2 years ago

Thanks for the links, very helpful πŸ’ͺ .

We'd have modules like encryption.aes or encryption.aws_kms or whatever, essentially modules under the encryption.* namespace.

I like the approach of having one module per EncryptionProvider. IIUC we would need a dedicated module namespace to ensure each encryption provider implements the same interface.

Personally I don't see the problem letting storage implementations manage encryption themselves. We could ask them to conform to some standard approach, like we could set up a encryption module namespace that storage modules could use, not sure the complexity is worth it though.

I am not certain this approach would bring much value. IIUC, each backend would only support its own set of encryptions, its own implementation. I feel that extending cert-magic itself with this capability would be a lot more powerful given the encryption provider would be completely independent from the storage provider.

To continue on that thought, I agree the storage section isn't the right place for this πŸ€” . Would the AutomationConfig be a better place? given that's the one governing the tls cetificate management? Something along the lines of

// imaginary
{
    "on_demand": {...},
    "encryption": {
        //  optional encryption (omit the "encryption" field for none)
        "module": "aws_kms",
        "keyId": "123456"
    },
   "policies": [{
        "subjects": [""],
        "storage": {...},
        "encryption": {...}, // Optionally configure a separate encryption module.
        ...
    }],
    ...
}
mholt commented 2 years ago

@francislavoie

So it would be a breaking change from a config standpoint, not only an implementation breakage (like the context thing is)

Are you sure? Why can't a storage module simply be configured to enable encryption, along with its other configuration parameters? Just add some fields to the struct for the storage provider you want to support encryption with.

francislavoie commented 2 years ago

@mholt

Are you sure? Why can't a storage module simply be configured to enable encryption, along with its other configuration parameters?

That's exactly what I was saying.

What was suggested here is that Caddy/CertMagic would encrypt before passing it to the storage module. There's nowhere in the config currently where it would be possible to configure encryption outside of the storage module, because the only configuration for storage is the storage module.

So either the config for it is in the storage module itself, or somewhere else altogether (like @chaudyg just suggested, in the automation config, so only certs/keys would be encrypted, not all storage).

@chaudyg

I like the approach of having one module per EncryptionProvider. IIUC we would need a dedicated module namespace to ensure each encryption provider implements the same interface.

Yep.

I am not certain this approach would bring much value. IIUC, each backend would only support its own set of encryptions, its own implementation. I feel that extending cert-magic itself with this capability would be a lot more powerful given the encryption provider would be completely independent from the storage provider.

Not quite, anything could define encryption modules, including dedicated separate plugins. The encryption module doesn't need to come from the storage plugin (probably shouldn't, really).

I'd suggest Caddy core could provide the AES module because that's super simple and easy, basically no knobs to turn there; that would mean there would always be at minimum "no encryption" and "aes" modes available for configuration, and any other modes could be provided by plugins that are more use-case specific (like loading from a KMS or whatever).

mholt commented 2 years ago

Ah, I see. You want to configure encryption independent of the storage module... that would add new config surface, but I don't think it would have to be a breaking change per-se.

I am not comfortable implementing or reviewing encryption code for this use case. I'd probably require a second set of eyes from an established cryptographer. (But don't have the budget to pay for it. It'd likely have to be a donated or volunteer review.)

chaudyg commented 2 years ago

I am not comfortable implementing or reviewing encryption code for this use case.

I agree. Encryption is never obvious and it's an area where we need to be careful. I've experimenting with Google Tink for Go lately (https://github.com/google/tink/blob/master/docs/GOLANG-HOWTO.md). Delegating encryption to something like Tink might reduce the risk of doing the wrong think:

How do you feel about a dedicated "Tink Encryption module" ?

mholt commented 2 years ago

It doesn't look like a bad option; would still want some expert review though.

Before we get into specifics, I still think we should figure out first whether this would be storage-implementation-specific, or a single "pre-storage" point of configuration (i.e. storage plugins remain unchanged, they just read and write encrypted bytes instead). Because it looks like some storage implementations already offer encryption, and I'm still a bit fuzzy on the benefits of at-rest encryption for these materials when the key itself would be in plaintext, potentially also in storage of some sort (like when configs are persisted).

Or maybe a storage wrapper is a better solution, so a generic encrypting storage module that wraps any other underlying module.

I dunno. Feel like this still needs more thorough discussion / contemplation.