gardener / machine-controller-manager

Declarative way of managing machines for Kubernetes cluster
Apache License 2.0

Improve error handling when `InitializeMachine` method is not yet implemented by a provider #934

Closed ashwani2k closed 2 months ago

ashwani2k commented 3 months ago

How to categorize this issue?

/area logging|ops-productivity /kind bug /priority 2

What happened: Currently, InitializeMachine is not implemented by any provider other than AWS. However, when the call to the provider implementation returns an error of type Unimplemented, MCM logs it as an error without first checking whether the error can be ignored. See https://github.com/gardener/machine-controller-manager/blob/ff8261398277c3e5a481f0[…]fb57c417dfd07754/pkg/util/provider/machinecontroller/machine.go.

    klog.V(3).Infof("Initializing VM instance for Machine %q", machine.Name)
    resp, err := c.driver.InitializeMachine(ctx, req)
    if err != nil {
        errStatus, ok := status.FromError(err)
        if !ok {
            klog.Errorf("Cannot decode Driver error for machine %q: %s. Unexpected behaviour as Driver errors are expected to be of type status.Status", machine.Name, err)
            return machineutils.LongRetry, err
        }
        klog.Errorf("Error occurred while initializing VM instance for machine %q: %s", machine.Name, err)
        switch errStatus.Code() {
        case codes.Unimplemented:
            klog.V(3).Infof("Provider does not support Driver.InitializeMachine - skipping VM instance initialization for %q.", machine.Name)

Can we log this as an Info/Warning message instead of an error?
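One possible shape for the requested change is to inspect the status code before choosing a log level, so that Unimplemented never reaches the error log. The following is only a minimal, stdlib-only sketch of that ordering, not the MCM implementation: `Code`, `StatusError`, `Severity`, and `classifyInitError` are hypothetical stand-ins invented for illustration (the real code uses gRPC's `status` and `codes` packages and `klog`).

```go
package main

import (
	"errors"
	"fmt"
)

// Code is a hypothetical stand-in for a gRPC status code.
type Code int

const (
	Internal Code = iota
	Unimplemented
)

// StatusError is a hypothetical stand-in for an error carrying a status code.
type StatusError struct {
	Code Code
	Msg  string
}

func (e *StatusError) Error() string {
	return fmt.Sprintf("code %d: %s", e.Code, e.Msg)
}

// Severity names the log level the caller should use.
type Severity string

const (
	SeverityInfo  Severity = "INFO"
	SeverityError Severity = "ERROR"
)

// classifyInitError decides, before anything is logged, how a non-nil
// InitializeMachine error should be reported and whether VM
// initialization can simply be skipped.
func classifyInitError(err error) (sev Severity, skip bool) {
	var se *StatusError
	if !errors.As(err, &se) {
		// Not a status-carrying error: unexpected, keep it at error level.
		return SeverityError, false
	}
	if se.Code == Unimplemented {
		// The provider does not support initialization: informational only.
		return SeverityInfo, true
	}
	return SeverityError, false
}

func main() {
	errs := []error{
		&StatusError{Code: Unimplemented, Msg: "InitializeMachine not supported"},
		&StatusError{Code: Internal, Msg: "provider call failed"},
		errors.New("undecodable driver error"),
	}
	for _, err := range errs {
		sev, skip := classifyInitError(err)
		fmt.Printf("[%s] skip=%t: %v\n", sev, skip, err)
	}
}
```

The point of the sketch is the ordering: the severity is decided from the code first, and the log call happens afterwards, so a provider without `InitializeMachine` produces a single Info line rather than an Error line followed by an Info line.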

What you expected to happen: MCM should ignore the error when the method is not yet implemented and only log this at Info level.

How to reproduce it (as minimally and precisely as possible): Create a machine on Azure, where this method is not implemented, and the logs will contain this error.

Anything else we need to know?:

Environment:

gardener-robot commented 3 months ago

@ashwani2k Label area/logging|ops-productivity does not exist.