microsoft / security-utilities

Security utilities for key generation, string redaction, etc.
MIT License

Add SEC101_061_LooseOAuth2BearerToken and notion of confidence #45

Closed michaelcfanning closed 4 months ago

michaelcfanning commented 5 months ago

This change first adds a detection, requested by a partner, that finds and redacts a loose OAuth 2.0 bearer token. The data stored in this header location itself appears to follow several different patterns and conventions (for example, a simple base64-encoded token or a CBOR Web Token).

The corresponding rule makes no effort to classify these values or to determine whether a provided value is recognizable, e.g., as a 32-byte base64-encoded value.

That leads me to mark it as 'low confidence', i.e., we expect this pattern to redact test and other non-sensitive data as well as actual secrets.
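To make the 'low confidence' point concrete, here is a minimal sketch of a loose bearer-token match and redaction. The regex, the `Detection` type, the `"low"` label, and the `SEC101/061` id format are all assumptions for illustration; this is not the pattern or data model that ships with the rule.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern, NOT the regex used by SEC101_061_LooseOAuth2BearerToken.
# It accepts anything after "Bearer" that fits the RFC 6750 token charset and
# makes no attempt to classify the value.
LOOSE_BEARER = re.compile(r"\bbearer\s+([A-Za-z0-9\-._~+/]+=*)", re.IGNORECASE)


@dataclass
class Detection:
    rule_id: str      # assumed id format, for illustration only
    confidence: str   # assumed label; the real representation may differ
    start: int        # offset of the token value within the scanned text
    end: int


def detect(text: str) -> list[Detection]:
    """Return one low-confidence detection per loose bearer-token match."""
    return [
        Detection("SEC101/061", "low", m.start(1), m.end(1))
        for m in LOOSE_BEARER.finditer(text)
    ]


def redact(text: str, placeholder: str = "***") -> str:
    """Replace only the token value, keeping the 'Bearer ' prefix intact."""
    def _mask(m: re.Match) -> str:
        prefix = m.group(0)[: m.start(1) - m.start(0)]
        return prefix + placeholder
    return LOOSE_BEARER.sub(_mask, text)
```

Because the pattern is this loose, `redact("the bearer token expired")` also masks the ordinary word `token`. That over-redaction is the reason for the low-confidence designation.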

michaelcfanning commented 5 months ago
"Name": "Unclassified32ByteBase64String",

So how do we designate these checks in terms of confidence? On the one hand, we are pretty sure we have a 32-byte base64 string. :) On the other, we don't have much confidence that we've identified a secret.

It all just shows how hard it is to categorize these checks. :)

I haven't provided any confidence at all yet, because these checks really only describe the literal format of the detected token; there is no attempt whatsoever to categorize it as a secret. We should think about this and make a call at some point, though.

btw - the application of these patterns is to sample/review data for missed detections.
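As an illustration of a format-only check, the sketch below tests whether a string decodes to exactly 32 bytes of base64. The function name is hypothetical and this is not the generated regex from the JSON file; it only shows why high confidence in the format says nothing about whether the value is a secret.

```python
import base64
import binascii


def is_32_byte_base64(candidate: str) -> bool:
    """Return True if candidate is strict base64 that decodes to 32 bytes.

    This describes the literal format only. An all-zero test key passes
    just as readily as a real secret.
    """
    try:
        decoded = base64.b64decode(candidate, validate=True)
    except (binascii.Error, ValueError):
        # Non-alphabet characters, bad padding, or non-ASCII input.
        return False
    return len(decoded) == 32
```

A 32-byte value encodes to 44 base64 characters including one `=` of padding, so any such string matches, whether it is a key, a hash, or test data.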


Refers to: GeneratedRegexPatterns/UnclassifiedPotentialSecurityKeys.json:26 in 946a003 (commit 946a00378b46d06e03517860d56e3a9b4080a56d).

michaelcfanning commented 5 months ago
"Id": "SEC000/000",

So how do we designate these checks in terms of confidence? On the one hand, we are pretty sure we have a 32-byte base64 string. :) On the other, we don't have much confidence that we've identified a secret.

It all just shows how hard it is to categorize these checks.

I haven't provided any confidence at all yet, because these checks really only describe the literal format of the detected token; there is no attempt whatsoever to categorize it as a secret. We should think about this and make a call at some point, though.

btw - the application of these patterns is to sample or review data for missed detections.


Refers to: GeneratedRegexPatterns/UnclassifiedPotentialSecurityKeys.json:25 in 946a003 (commit 946a00378b46d06e03517860d56e3a9b4080a56d).