microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License
3.03k stars 399 forks source link

SF 7.1 certificate validation misinteprets CrlCheckingFlag as certificate chain status flags #1003

Open jberezanski-mdg opened 4 years ago

jberezanski-mdg commented 4 years ago

Service Fabric Runtime Version: 7.1.428

Environment: Azure

Description:

The Security/CrlCheckingFlag cluster setting is documented as containing revocation check flags passed to the CertGetCertificateChain function (via its dwFlags parameter, https://docs.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-certgetcertificatechain) and the sample values in the documentation (e.g. 0x10000000 = CERT_CHAIN_REVOCATION_CHECK_END_CERT) are consistent with this.

However, the certificate validation code introduced in SF 7.1 (System.Fabric.Management.WindowsFabricValidator.SecuritySettingsValidator.ExtractX509CredentialValidationRules()) treats this parameter as though it was supposed to contain certificate chain status flags (i.e. values from the CERT_TRUST_STATUS structure, https://docs.microsoft.com/en-us/windows/win32/api/wincrypt/ns-wincrypt-cert_trust_status) and attempts to match this parameter to the result of CertGetCertificateChain (called indirectly by X509Chain.Build()), which is completely incorrect.

Additionally, the code does not take the CrlCheckingFlag value into account when preparing the X509ChainPolicy object (which is used to set the dwFlags argument of CertGetCertificateChain) and uses a hardcoded value of X509RevocationMode.NoCheck (in System.Fabric.Common.CertificateManager.X509CertificateChainValidator constructor).

Luckily, the default value of CrlCheckingFlag (0x40000000 = CERT_CHAIN_REVOCATION_CHECK_CHAIN_EXCLUDE_ROOT) does not map to any currently defined CERT_TRUST_STATUS value (but this could change in the future). However, these two bugs make the new 7.1 certificate validation unreliable with respect to revocation checking.

(The above analysis is based on the decompiled code of SF 7.1.428, specifically, System.Fabric.Management.dll and System.Fabric.dll.)

Expected Behavior:

Certificate verification code correctly uses the value of CrlCheckingFlag to configure revocation check behavior by setting the appropriate properties of X509ChainPolicy.

OS(Windows/Linux): Windows


Assignees: @microsoft/service-fabric-triage

gkhanna79 commented 4 years ago

@dragav Please do triage.

dragav commented 4 years ago

@jberezanski-mdg thanks for the detailed report; based on a quick look at the code, your analysis seems correct (and warrants a fix). The intent was to suppress directly chain status errors (beyond just CRL checking), as the SF runtime does already mask out specific statuses. Please note, however, that:

jberezanski-mdg commented 4 years ago

Thank you for confirming my reasoning. I'm not affected by this in any way. I simply noticed this in code while diagnosing another validation issue (#1002) and wanted to let you know.

jberezanski-mdg commented 4 years ago

There is one other place in code related to certificate validation I suspect might not behave as expected.

System.Fabric.Common.CertificateManager.PinnedLeafX509CertificateChainValidator accepts a bool allowExpiredCerts parameter, which it later uses in its ExtendedCustomValidation() overload to conditionally add X509ChainStatusFlags.NotTimeValid to the list of allowed chain status flags and proceeds to verify the chain status against the extended allowed statuses list. However, the chain status is later verified again in the base class (System.Fabric.Common.CertificateManager.X509CertificateChainValidator) method Build(), which has no knowledge of the additional flag allowed by PinnedLeafX509CertificateChainValidator.

Therefore, it seems that passing allowExpiredCerts = true to PinnedLeafX509CertificateChainValidator will have no effect (so the cluster setting Security/AcceptExpiredPinnedClusterCertificate will be ignored by the security settings prevalidation).

dragav commented 4 years ago

I'm not sure how you determined this

Therefore, it seems that passing allowExpiredCerts = true to PinnedLeafX509CertificateChainValidator will have no effect

The security settings validator iterates through all of the certificate presentation declarations, matching them against the validation declarations; each validation declaration corresponds to one of the custom validators which extends, as you observed, the base class validation logic. That is, there will be multiple attempts to validate a cert, and so the chain is being built repeatedly. (That, and we have tests covering this scenario as well.)

dragav commented 4 years ago

(edited for formatting)

jberezanski-mdg commented 4 years ago

I mean specifically the PinnedLeafX509CertificateChainValidator class (used when the cluster configuration specifies certificates by thumbprint) and its parameter allowExpiredCerts.

Here is a reproduction in PowerShell.

$types = Add-Type -Path .\System.Fabric.dll -PassThru
$valType = $types | Where-Object Name -eq PinnedLeafX509CertificateChainValidator

$allowExpiredCerts = $true
$allowedChainStatusFlags = [System.Security.Cryptography.X509Certificates.X509ChainStatusFlags]::UntrustedRoot
$val = [Activator]::CreateInstance($valType, @($allowExpiredCerts, $allowedChainStatusFlags))

$validCert = New-SelfSignedCertificate -Type SSLServerAuthentication -Subject 'CN=TestCert-Valid' -NotBefore '2020-07-01' -NotAfter (Get-Date).AddDays(100) -CertStoreLocation Cert:\CurrentUser\My
$expiredCert = New-SelfSignedCertificate -Type SSLServerAuthentication -Subject 'CN=TestCert-Expired' -NotBefore '2020-07-01' -NotAfter '2020-07-10' -CertStoreLocation Cert:\CurrentUser\My

$val.Build($validCert) # returns true
$val.ChainStatus # shows UntrustedRoot

$val.Reset()
$val.Build($expiredCert) # should return true because allowExpiredCerts=true, but in fact returns false
$val.ChainStatus # shows UntrustedRoot and NotTimeValid
craftyhouse commented 3 years ago

Dropping a note to make sure it's a clear on this side. We do have this logged internally and I made sure it was bumped.