Venafi / vault-pki-backend-venafi

Venafi PKI Secrets Engine plugin for HashiCorp Vault that enables certificate enrollment using Venafi machine identity services.
Mozilla Public License 2.0
54 stars 20 forks source link

Need explanation of error messages #96

Open GeoffVenafi opened 2 years ago

GeoffVenafi commented 2 years ago

Hi Team,

I have a customer that is running into some error messages while they are using the Vault-PKI-Backend-Venafi. They would like to know the reason for these errors so the devops team can create some error handling to better address these errors as they come up.

Here are the errors they are concerned about:

  1. {"errors":["unable to retrieve: Unexpected status code on TPP Certificate Retrieval. Status: 500 Certificate \VED\Policy\Integrations\HashiCorp\Test\Standard\gposetup-rms-oytydev1.ose-dev39-red.aws-use1.cloud.marriott.com has encountered an error while processing, Status: This certificate cannot be processed while it is in an error state. Fix any errors, and then click Retry., Stage: 500."]}

NOTE: We believe this error is related to the CA not responding in time and Venafi places the cert in Error

  1. ERROR: {"errors":["2 errors occurred:\n\t errors from both primary and secondary; primary error was unable to retrieve: Post https://venafiintegration.marriott.com/vedsdk/certificates/retrieve: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers); secondary errors follow\n\t unable to retrieve: Post https://venafiintegration.marriott.com/vedsdk/certificates/retrieve: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)\n\n"]}

NOTE: We believe this is due to Venafi not getting the cert within the 60 second time limit for the CSR to stay within HashiCorp Vaults memory, and the cert cannot be completed

  1. [ERROR] core: failed to register lease: request_path=venafi-pki/issue/tpp-backend error=\"rpc error: code = Canceled desc = context canceled\""}

NOTE: Not sure what caused this, but we had a lot of these at once, so maybe an issue of Vault reaching TPP?

Let me know if you need any additional information for this request.

Thanks, Geoff

maelvls commented 1 year ago

I can comment on the first error message:

This certificate cannot be processed while it is in an error state. Fix any errors, and then click Retry.

This happens when requesting a certificate for which the enrollment was previously failing. For example, if your CA fails while enrolling a certificate, then you may see something like:

unable to retrieve: Unexpected status code on TPP Certificate Retrieval. Status: 500 Certificate \VED\Policy\TLS/SSL\aexample.com has encountered an error while processing, Status: Post CSR failed with error: Cannot connect to the certificate authority (CA)., Stage: 500.

After this enrollment failure, any request for that same certificate will invariably lead to the following error:

unable to retrieve: Unexpected status code on TPP Certificate Retrieval. Status: 500 Certificate \VED\Policy\TLS/SSL\aexample.com has encountered an error while processing, Status: This certificate cannot be processed while it is in an error state. Fix any errors, and then click Retry., Stage: 500.

I have been working on a fix in https://github.com/Venafi/vcert/pull/269.