loco-rs / loco

🚂 🦀 The one-person framework for Rust for side-projects and startups
https://loco.rs
Apache License 2.0
5.18k stars, 214 forks

Redact logger fields #111

Open kaplanelad opened 10 months ago

kaplanelad commented 10 months ago

We should explore incorporating a mechanism to automatically redact sensitive fields when users log data. If users input information such as tokens, API keys, passwords, or similar sensitive data, we can enhance security by obfuscating these fields with asterisks (***).

This list of sensitive fields can be conveniently managed through our configuration settings, providing flexibility and making it an optional feature.

jayy-lmao commented 10 months ago

This has been done in Zero To Production in Rust using the newtype pattern (i.e., having a 'Secret' type), where the Display and Debug implementations do exactly as you say.
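A minimal sketch of that newtype idea, using only the standard library. The `Secret` type, its `expose` method, and the `***` mask are names chosen here for illustration; they are not taken from the book or from any crate.

```rust
use std::fmt;

/// Hypothetical newtype wrapping a sensitive string (illustrative only).
pub struct Secret(String);

impl Secret {
    pub fn new(value: impl Into<String>) -> Self {
        Secret(value.into())
    }

    /// Explicit accessor, so every read of the real value is easy to find.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

// Both formatting traits print a fixed mask instead of the inner value.
impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("***")
    }
}

impl fmt::Display for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("***")
    }
}
```

With this in place, formatting a `Secret` with `{}` or `{:?}` yields the mask, and the real value is only reachable through the explicit `expose` call.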

ronlobo commented 9 months ago

Any PII (personally identifiable information) should be redacted, which should lead to compliance with GDPR, CISPA & Co.

They also require encryption in the storage layer for PII data, which for example can be solved through pgcrypto in Postgres.

Anyways, I also read the book (Zero to Production in Rust) and can apply this.

@jayy-lmao you wanna take it? Otherwise I'm happy to take a peek :)

kaplanelad commented 9 months ago

@ronlobo Yes, that would be amazing! Go for it. Let me know if you need my assistance.

ronlobo commented 9 months ago

@kaplanelad I did a bit of research into what's already available. There are a couple of options; however, Veil seems to offer the most flexibility and has a couple of active contributors.

Determining which fields are sensitive fields and need to be redacted is ultimately a decision the end user has to make.

How would you envision making the above available in the config file? Would we use the crate internally (e.g., in the user model) and make it available as a feature to end users?

kaplanelad commented 9 months ago

@ronlobo, thank you for your thorough investigation!

I believe it's essential to establish the Definition of Done for this task. From my perspective, we should keep the implementation simple and let the user configure which fields are redacted, with a list of default redactions provided as part of the local template.

To illustrate, within the configuration YAML file under the logger section, we should introduce a new reduce subsection that looks like this:

# Application logging configuration
logger:
  .
  .
  reduce:
    keys:
      - TOKEN
      - GITHUB_TOKEN
    patterns:
      - REGEX

Subsequently, when a user writes a log in Rust:

tracing::info!(token = "XXXX", "fetching data");

The expected output log, based on the YAML configuration, would appear as follows:

{....,"level":"INFO","fields":{"message":"fetching data","token":"*****"},"target":"myapp::controllers::home"}

This way, there is no need to mark which fields of a struct are redacted. You may also want to explore a library I developed: redact-engine. It's integrated with env-logger. Perhaps we could extend its support to tracing.
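The key-based part of this proposal could be sketched roughly as below. This uses only the standard library and is not wired into tracing or redact-engine. `RedactConfig` and its methods are hypothetical names. Matching keys case-insensitively is an assumption, made so that the configured `TOKEN` covers the logged `token` field. The `patterns` list is left out because it would need a regex dependency.

```rust
use std::collections::BTreeMap;

/// Hypothetical mirror of the `keys` list under `logger.reduce` in the YAML.
pub struct RedactConfig {
    pub keys: Vec<String>,
}

impl RedactConfig {
    /// Assumption: keys match field names case-insensitively.
    fn is_sensitive(&self, field: &str) -> bool {
        self.keys.iter().any(|k| k.eq_ignore_ascii_case(field))
    }

    /// Returns a copy of the log fields with sensitive values masked.
    pub fn redact(&self, fields: &BTreeMap<String, String>) -> BTreeMap<String, String> {
        fields
            .iter()
            .map(|(name, value)| {
                let shown = if self.is_sensitive(name) {
                    "*****".to_string()
                } else {
                    value.clone()
                };
                (name.clone(), shown)
            })
            .collect()
    }
}
```

A real integration would have to apply this step wherever the logger formats event fields; where that hook lives in tracing is not decided in this thread.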

What are your thoughts on this?

ronlobo commented 8 months ago

Hey @kaplanelad, yes, definition of done makes sense.

I will take a look into redact-engine and tracing support.

joshka commented 1 week ago

You may also consider using Secrecy for this as it can be useful to have new types that support being redacted on serialization correctly. There are pros and cons of the field vs type approach, but to me the big one is that the field approach leads to fields being manually tagged (don't display this particular credit card field) while the type approach means that any time the type appears in any other type as a field it will be redacted. I see that as a huge benefit of a securely designed approach to PII / sensitive data management.
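To illustrate the type approach, here is a small standard-library sketch. `Redacted<T>`, `Card`, and `Order` are made-up names, and this is not the Secrecy crate's API. The point is that once the wrapper's `Debug` is masked, every struct that derives `Debug` and contains it is covered with no per-field tagging.

```rust
use std::fmt;

/// Hypothetical wrapper type; a stand-in for a crate-provided secret type.
pub struct Redacted<T>(T);

impl<T> Redacted<T> {
    pub fn new(value: T) -> Self {
        Redacted(value)
    }

    /// Explicit access to the wrapped value.
    pub fn expose(&self) -> &T {
        &self.0
    }
}

// The mask lives on the type, so it applies wherever the type is used.
impl<T> fmt::Debug for Redacted<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("[REDACTED]")
    }
}

#[derive(Debug)]
pub struct Card {
    pub holder: String,
    pub number: Redacted<String>,
}

// `Order` never mentions redaction, yet its derived Debug output is safe
// because it delegates to `Card`, which delegates to `Redacted`.
#[derive(Debug)]
pub struct Order {
    pub id: u32,
    pub card: Card,
}
```

Debug-printing an `Order` shows `[REDACTED]` in place of the card number, however deeply the wrapper is nested.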