Closed keithmattix closed 2 years ago
Dependent on #5044
Implementation
NOTE: For the in-scope scenarios, all status updates are the same. The implementation could be something like the following:
- Receive MRC add event
- Check there are no more than 2 MRCs (mrcClient.ListMeshRootCertificates()). If there are more than 2, the add event should be ignored. This might happen if multiple MRCs are created at the same time and therefore all pass the validating webhook
- Call RetryOnConflict. Define retry func:
  - Get the MRC
  - Check that the status is nil, and if it is not nil return nil
  - Set the status
  - Update the MRC status (mrcClient.UpdateMeshRootCertificateStatus)
  - Return err value (either error or nil)
- If RetryOnConflict failed, log the error (this is a fatal error - not sure how this should be handled?)
- If RetryOnConflict succeeded, continue execution
@keithmattix What are your thoughts on how we should handle step 2? Should the validating webhook be responsible for limiting the number of active/passive MRCs that are added? If somehow more than 2 MRCs exist, how should the controller respond? Should it set a special condition on the MRC, or should the MRC be in an error state?
The validating webhook should limit MRCs to 2. If by some chance, an MRC gets past the webhook, the controller should ignore add events if there are 2 MRCs with status/conditions set
With this + upgrades we have more reasons for an internal state object. With 2 phase commit we can completely avoid the scenario of having multiple mesh certs in the wrong state
I'm not sold on 2pc honestly; from my light reading, I have concerns about reads during the "promise" stage of the commit. The saga pattern seems more widely adopted, and something like nats would let us implement that, but honestly, the more prevalent pattern in Kubernetes is a lock/lease for each control plane component (each of which could have multiple replicas) so that there's only one writer. I'd much prefer that approach because we could guarantee that the writer's view of the world is correct.
Closing in favor of https://github.com/openservicemesh/osm/issues/5179
Based on Step 2 in the design doc, osm-controller should monitor the MRCAdded event and initialize the status fields (including conditions). Furthermore, osm-controller should confirm the connection to the issuer and update the Accepted condition accordingly (error or success).
Be aware that conflicts may occur when multiple osm-controller replicas attempt to update status at the same time; consider using RetryOnConflict to query k8s for the latest version of the resource, add/update the status if it hasn't been done already (return early if it has), and retry with backoff if a conflict error occurs.
Refer to #4848 for examples of where (handleMRCEvent in manager.go) and how to respond to MRC events. Note that #4848 is based on an old version of MRC statuses and state changes, but the general approach is still applicable.
An MRCAdded event will be received in the following scenarios:
- spec.intent=active
  - status.state or if it is already set. Retry if not set and no error
  - status.state should be set to Pending
  - status.certificateStatuses should all be set to Unknown
  - status.conditions Accepted should be set to status=False and reason=Pending
  - status.state should be set to Issuing
  - status.certificateStatuses should all be set to Issuing
  - status.conditions Accepted should be set to status=True and reason=CertificateAccepted
  - status.conditions IssuingRollout should be set to status=True and reason=CertificateInUseForIssuing
  - status.conditions ValidatingRollout should be set to status=True and reason=CertificateInUseForValidating
  - status.conditions Ready should be set to status=True and reason=RotationComplete
- spec.intent=passive
  - status.state should be set to Pending
  - status.certificateStatuses should all be set to Unknown
  - status.conditions Accepted should be set to status=False and reason=Pending
- spec.intent=active
  - status.state should be set to Pending
  - status.certificateStatuses should all be set to Unknown
  - status.conditions Accepted should be set to status=False and reason=Pending
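For illustration, the Pending and Issuing field values listed above could be produced by two small constructors. This is only a sketch: the `Condition` and `MRCStatus` types are simplified, hypothetical stand-ins for the real API types, and the `components` argument (the keys of `certificateStatuses`) is an assumption, since the component names are not given here.

```go
package main

// Condition is a simplified, hypothetical stand-in for a status condition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// MRCStatus is a simplified, hypothetical stand-in for the MRC status.
type MRCStatus struct {
	State               string
	CertificateStatuses map[string]string // keyed by assumed component names
	Conditions          []Condition
}

// allSetTo returns a certificate status map with every component set to value.
func allSetTo(components []string, value string) map[string]string {
	out := make(map[string]string, len(components))
	for _, c := range components {
		out[c] = value
	}
	return out
}

// pendingStatus builds the Pending status: state Pending, all certificate
// statuses Unknown, and Accepted set to status=False, reason=Pending.
func pendingStatus(components []string) MRCStatus {
	return MRCStatus{
		State:               "Pending",
		CertificateStatuses: allSetTo(components, "Unknown"),
		Conditions: []Condition{
			{Type: "Accepted", Status: "False", Reason: "Pending"},
		},
	}
}

// issuingStatus builds the Issuing status with all four conditions set to True.
func issuingStatus(components []string) MRCStatus {
	return MRCStatus{
		State:               "Issuing",
		CertificateStatuses: allSetTo(components, "Issuing"),
		Conditions: []Condition{
			{Type: "Accepted", Status: "True", Reason: "CertificateAccepted"},
			{Type: "IssuingRollout", Status: "True", Reason: "CertificateInUseForIssuing"},
			{Type: "ValidatingRollout", Status: "True", Reason: "CertificateInUseForValidating"},
			{Type: "Ready", Status: "True", Reason: "RotationComplete"},
		},
	}
}
```

Keeping the values in one place like this means the add-event handler only has to pick which constructor applies to the scenario, then write the result inside the retry func.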