Venafi / vcert

Go client SDK and command line utility designed to simplify integrations by automating key generation and certificate enrollment using Venafi machine identity services.
https://support.venafi.com/hc/en-us/articles/217991528
Apache License 2.0

Provide the ability to reset the certificate object in Venafi TPP #239

Open sitaramkm opened 2 years ago

sitaramkm commented 2 years ago

BUSINESS PROBLEM If the downstream CA service is down for any reason, Venafi TPP changes the status of the certificate object to Error.

The scenario to reproduce this is simple: cert-manager ---> Venafi TPP ---> MSCA

cert-manager and MSCA could be replaced with any consumer and provider.

PROPOSED SOLUTION vCert provides a mechanism to reset the certificate object so consumers can attempt to heal the situation.

CURRENT ALTERNATIVES Currently, the only way to recover is to manually reset the certificate object in the UI and retry a renewal via API.

hawksight commented 2 years ago

Perhaps this could be exposed in the vcert library and also as a command line option?

E.g., a reset subcommand, something like:

USAGE:
   vcert [global options] command [command options] [arguments...]

ACTIONS:

   gencsr       To generate a certificate signing request (CSR)
   enroll       To enroll a certificate
   pickup       To retrieve a certificate
   renew        To renew a certificate
   reset        To reset a certificate state
   revoke       To revoke a certificate

Requiring an ID to target which certificate to reset:

vcert reset -u <HOST> -t <TOKEN> --id <CERT_ID>

tr1ck3r commented 2 years ago

Thank you for raising this issue @sitaramkm. This problem is a side effect of TPP's object-based design and does not apply to Venafi as a Service. As such, that gives me reservations about adding a new action (or a new option to an existing action) as @hawksight proposed. Instead I think this "self-healing" should just be how VCert behaves. No certificate request should ever be influenced by the success or failure of any previous certificate request.

Guidance for anyone contributing this update to the project:

It is important we avoid introducing additional API calls for the majority case (i.e., where the request succeeds because there is no existing certificate object in error). That means not adding logic before every request to check whether a certificate object already exists and, if so, whether it is "in error". Instead, the reset/retry logic should only be triggered by the POST /vedsdk/certificates/retrieve call failing with an HTTP 500 error response. When the API request was made for a certificate object that was in error, that response has the following body:

{
 "Stage": 500, 
 "Status": "WebSDK CertRequest Module Requested Certificate"
}

That error confirms there was a certificate object in error state prior to the current request being made and it should trigger a POST /vedsdk/certificates/reset call with "Restart": false followed by repeating the POST /vedsdk/certificates/request call with the original payload. This won't guarantee the certificate request will be successful but it will ensure that the current certificate request is always attempted.
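The flow described above can be sketched in Go roughly as follows. This is only an illustration under stated assumptions, not vcert's actual implementation: the `tppClient` interface, the `errObjectInError` sentinel, the `fakeTPP` stub, and the function names are all hypothetical stand-ins for the three WebSDK calls, and the final retrieve after the repeated request is an assumption about how the caller would continue.

```go
package main

import (
	"errors"
	"fmt"
)

// tppClient is a hypothetical stand-in for the three TPP WebSDK calls
// mentioned in the guidance; it is NOT the vcert API.
type tppClient interface {
	Request(payload string) error      // POST /vedsdk/certificates/request
	Retrieve(dn string) error          // POST /vedsdk/certificates/retrieve
	Reset(dn string, restart bool) error // POST /vedsdk/certificates/reset
}

// errObjectInError is a hypothetical sentinel for the HTTP 500 retrieve
// failure that indicates the certificate object was already in error.
var errObjectInError = errors.New("certificate object is in an error state")

// requestWithSelfHealing makes no extra API calls in the majority case.
// Only when retrieve reports the error state does it reset the object
// (Restart=false) and repeat the request with the original payload.
func requestWithSelfHealing(c tppClient, dn, payload string) error {
	if err := c.Request(payload); err != nil {
		return err
	}
	err := c.Retrieve(dn)
	if !errors.Is(err, errObjectInError) {
		return err // nil on success, or an unrelated failure
	}
	if err := c.Reset(dn, false); err != nil {
		return fmt.Errorf("reset: %w", err)
	}
	if err := c.Request(payload); err != nil {
		return fmt.Errorf("retry request: %w", err)
	}
	// Assumption: the caller retrieves again after the repeated request.
	return c.Retrieve(dn)
}

// fakeTPP is an in-memory stub that records which calls were made.
type fakeTPP struct {
	inError bool
	calls   []string
}

func (f *fakeTPP) Request(string) error {
	f.calls = append(f.calls, "request")
	return nil
}

func (f *fakeTPP) Retrieve(string) error {
	f.calls = append(f.calls, "retrieve")
	if f.inError {
		return errObjectInError
	}
	return nil
}

func (f *fakeTPP) Reset(_ string, _ bool) error {
	f.calls = append(f.calls, "reset")
	f.inError = false
	return nil
}

func main() {
	f := &fakeTPP{inError: true}
	err := requestWithSelfHealing(f, "example-dn", "{}")
	fmt.Println(f.calls, err) // [request retrieve reset request retrieve] <nil>
}
```

With a stub that is not in error, the same function makes only the request and retrieve calls, which is the "no additional API calls for the majority case" property the guidance asks for.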

maelvls commented 1 year ago

This issue was fixed in vcert v4.23.0 (https://github.com/Venafi/vcert/pull/269). Regarding cert-manager, the issue will be fixed as part of 1.11 (https://github.com/cert-manager/cert-manager/pull/5674).

If you are hitting one of the two error messages:

unable to retrieve: Unexpected status code on TPP Certificate Retrieval. Status: 500 Certificate has encountered an error while processing, Status: WebSDK CertRequest Module Requested Certificate, Stage: 400.

or

unable to retrieve: Unexpected status code on TPP Certificate Retrieval. Status: 500 Certificate has encountered an error while processing, Status: This certificate cannot be processed while it is in an error state. Fix any errors, and then click Retry., Stage: 400.

then I recommend that you upgrade to vcert v4.23.0 (the stage number doesn't matter in the above messages).
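For anyone triaging logs, a small helper along these lines could flag the two messages above while ignoring the stage number. It is a hypothetical sketch, not part of vcert; the function name `looksLikeErrorState` and the plain substring matching are assumptions for illustration, and the status texts are copied from the messages quoted in this comment.

```go
package main

import (
	"fmt"
	"strings"
)

// Status texts copied from the two error messages quoted above.
var errorStateStatuses = []string{
	"WebSDK CertRequest Module Requested Certificate",
	"This certificate cannot be processed while it is in an error state.",
}

// looksLikeErrorState (hypothetical helper) reports whether msg resembles
// one of the two messages. The Stage number is deliberately not inspected.
func looksLikeErrorState(msg string) bool {
	if !strings.Contains(msg, "Status: 500") {
		return false
	}
	for _, s := range errorStateStatuses {
		if strings.Contains(msg, "Status: "+s) {
			return true
		}
	}
	return false
}

func main() {
	msg := "unable to retrieve: Unexpected status code on TPP Certificate Retrieval. " +
		"Status: 500 Certificate has encountered an error while processing, " +
		"Status: WebSDK CertRequest Module Requested Certificate, Stage: 400."
	fmt.Println(looksLikeErrorState(msg)) // true
}
```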

luispresuelVenafi commented 1 year ago

This use case was addressed by v4.23.0

maelvls commented 1 year ago

@luispresuelVenafi Could we re-open this issue? Although this issue was fixed in 4.23.0, it was then reverted in VCert 4.24.0. More context is available in https://github.com/Venafi/vcert/issues/273#issuecomment-1556938953.

luispresuelVenafi commented 1 year ago

@maelvls sure. This is still a pending issue due to the revert.

maelvls commented 11 months ago

This has been partially fixed in VCert 5.0.0 with the introduction of the ResetCertificate Go function (https://github.com/Venafi/vcert/pull/295).

No CLI command was added though (e.g., vcert reset). I know that @hawksight talked about vcert reset; is it still needed?