BUG: Mark SEC000/000.Unclassified32ByteBase64String, SEC000/001.Unclassified64ByteBase64String, and SEC101/101.AadClientAppLegacyCredentials as DetectionMetadata.LowConfidence.
BUG: Mark SEC101/109.AzureContainerRegistryLegacyKey as DetectionMetadata.MediumConfidence.
BUG: Mark SEC101/030.NuGetApiKey, SEC101/105.AzureMessageLegacyCredentials, SEC101/110.AzureDatabricksPat, SEC101/050.NpmAuthorKey, and SEC101/565.SecretScanningSampleToken as DetectionMetadata.HighConfidence.
PRF: Enable scan pre-filtering by declaring .servicebus as a SEC101/105.AzureMessageLegacyCredentials signature.
I've introduced a new unit test that enforces a development invariant: every pattern must include an explicit confidence (i.e., accuracy or precision) designation of High, Medium, or Low. I've also updated the doc comments to describe these categories. Maybe I should include the following there as well, but typically:
Low == implemented only on an ad hoc basis for deep research or review purposes.
Medium == suitable to surface as warnings and/or bypassable findings in blocking scenarios.
High == very low noise rates, could be a candidate for hard-blocking scenarios.
Identifiable == mathematically nearly certain to be a true positive. Suitable for hard-blocking/other prescriptive controls.
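The first invariant can be illustrated with a minimal sketch. This is Python rather than the project's .NET test code, and the `Pattern` class, the `NONE` member, and the `patterns_missing_confidence` helper are hypothetical names chosen for illustration. Only the three confidence flag names come from this PR.

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import List


class DetectionMetadata(Flag):
    # Hypothetical mirror of the flags named in this PR; the real enum may differ.
    NONE = 0
    LowConfidence = auto()
    MediumConfidence = auto()
    HighConfidence = auto()


# The three levels a pattern may choose from.
CONFIDENCE_LEVELS = (
    DetectionMetadata.LowConfidence,
    DetectionMetadata.MediumConfidence,
    DetectionMetadata.HighConfidence,
)


@dataclass
class Pattern:
    # Hypothetical stand-in for a detection pattern definition.
    id: str
    metadata: DetectionMetadata


def patterns_missing_confidence(patterns: List[Pattern]) -> List[str]:
    """Return the ids of patterns that do not declare exactly one confidence level."""
    failures = []
    for pattern in patterns:
        declared = [level for level in CONFIDENCE_LEVELS if level in pattern.metadata]
        if len(declared) != 1:
            failures.append(pattern.id)
    return failures
```

An invariant test built this way would assert that the returned list is empty for the full pattern set, so a newly added pattern without a designation fails the build rather than slipping through unclassified.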
While doing this work, I noticed that a medium-confidence pattern was missing a signature used for pre-filtering. (In .NET, these values are confirmed to be present in a scan target using string.IndexOf, a check that is empirically observed to be ~10x to 20x faster than non-backtracking regex engines such as RE2.) I've added a second invariant test that requires all patterns of medium confidence or higher to include one or more signatures to enable the pre-check.
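As a rough sketch of the pre-filtering idea and the second invariant, the snippet below uses a plain substring check before running a regex. It is written in Python for illustration only. `SecretPattern`, `scan`, and `patterns_missing_signatures` are hypothetical names, and the regex is a made-up placeholder, not the real SEC101/105 expression.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class SecretPattern:
    # Hypothetical stand-in for a detection pattern definition.
    id: str
    regex: re.Pattern
    signatures: List[str] = field(default_factory=list)
    medium_or_higher: bool = True


def scan(pattern: SecretPattern, text: str) -> List[str]:
    """Run the regex only if at least one signature occurs in the text."""
    # Cheap substring pre-check, playing the role of string.IndexOf in .NET.
    if pattern.signatures and not any(sig in text for sig in pattern.signatures):
        return []
    return [match.group(0) for match in pattern.regex.finditer(text)]


def patterns_missing_signatures(patterns: List[SecretPattern]) -> List[str]:
    """Return the ids of medium-or-higher confidence patterns with no signature."""
    return [p.id for p in patterns if p.medium_or_higher and not p.signatures]


# Illustrative pattern: the id and signature come from this PR,
# the regex is an assumption for demonstration purposes.
service_bus = SecretPattern(
    id="SEC101/105.AzureMessageLegacyCredentials",
    regex=re.compile(r"[a-z0-9-]+\.servicebus\.[a-z.]+"),
    signatures=[".servicebus"],
)
```

With this shape, `scan(service_bus, text)` returns immediately for any input that lacks `.servicebus`, and the invariant test reduces to asserting that `patterns_missing_signatures` returns an empty list.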
@nguerrera @suvamM @rwoll @shaopeng-gh @yongyan-gh @LingZhou-gh @evelyn-ys