GoogleContainerTools / skaffold

Easy and Repeatable Kubernetes Development
https://skaffold.dev/
Apache License 2.0

Skaffold V2 seems to hang indefinitely when deploying Pod K8s Objects (most of the time) #7667

Closed by aaron-prindle 2 years ago

aaron-prindle commented 2 years ago

When using the Skaffold V2 binary, status checking appears to have regressed: deploying Pods can hang forever. This did not happen in Skaffold V1. It occurs with the /examples/* applications that deploy a Pod (run skaffold dev on one of them and it should hang waiting for deployments).

When running skaffold dev on these examples (which deploy a Pod), the deploy hangs with:

Waiting for deployments to stabilize...
aaron-prindle commented 2 years ago

It seems that len(r.resources) is always 0 here in V2 for Pods, which is incorrect and is what causes this issue: https://github.com/GoogleContainerTools/skaffold/blob/main/pkg/skaffold/kubernetes/status/resource/deployment.go#L145-L147

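func (r *Resource) checkStandalonePodsStatus(ctx context.Context, cfg kubectl.Config) *proto.ActionableErr {
    // With no tracked resources, the check reports the pods as still pending.
    if len(r.resources) == 0 {
        return &proto.ActionableErr{ErrCode: proto.StatusCode_STATUSCHECK_STANDALONE_PODS_PENDING}
    }
    // ... (rest of the function omitted in this excerpt)
}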
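To illustrate why this guard turns into a hang, here is a minimal sketch, assuming the status check is simply retried until it stops reporting pending. The names checkPods and pollUntilReady are hypothetical and are not Skaffold APIs; the real check also inspects the state of each pod.

```go
package main

import "fmt"

type status string

const (
	pending status = "pending"
	ready   status = "ready"
)

// checkPods is a hypothetical stand-in for the guard in the excerpt:
// with no tracked resources it always reports pending.
func checkPods(resources []string) status {
	if len(resources) == 0 {
		return pending
	}
	return ready
}

// pollUntilReady retries the check up to maxAttempts times and reports
// whether it ever returned ready. Without a cap, an empty resource list
// would keep this loop running forever.
func pollUntilReady(resources []string, maxAttempts int) bool {
	for i := 0; i < maxAttempts; i++ {
		if checkPods(resources) == ready {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(pollUntilReady(nil, 3))               // empty list: never ready
	fmt.Println(pollUntilReady([]string{"pod/a"}, 3)) // one resource: ready
}
```

If the resource list is never populated, no number of retries changes the result, which would match the observed "Waiting for deployments to stabilize..." hang.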
aaron-prindle commented 2 years ago

The root issue seems to be that in V2 r.namespaces is somehow empty for Pods when it should have a default entry. As a result, this nested for loop does nothing: https://github.com/GoogleContainerTools/skaffold/blob/main/pkg/diag/diag.go#L75-L76

func (d *diag) Run(ctx context.Context) ([]validator.Resource, error) {
    var (
        res  []validator.Resource
        errs []error
    )
    // get selector from labels
    selector := labels.SelectorFromSet(d.labels)
    listOptions := metav1.ListOptions{
        LabelSelector: selector.String(),
    }

    for _, v := range d.validators {
        for _, ns := range d.namespaces {
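            // Never reached: d.namespaces is empty in the V2 binary at the moment.
            r, err := v.Validate(ctx, ns, listOptions)
            // ... (rest of the loop body omitted in this excerpt)
        }
    }
    // ... (rest of the function omitted in this excerpt)
}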
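The effect of an empty namespace list on a nested loop like this can be shown in isolation. The sketch below is not Skaffold code: validateAll and its parameters are hypothetical names that only mirror the loop's shape and count how many times the inner body runs.

```go
package main

import "fmt"

// validateAll mirrors the shape of the nested loop in the excerpt: it calls
// validate once per (validator, namespace) pair and returns the call count.
// All names here are hypothetical and not part of Skaffold.
func validateAll(validators, namespaces []string, validate func(v, ns string)) int {
	calls := 0
	for _, v := range validators {
		for _, ns := range namespaces {
			validate(v, ns)
			calls++
		}
	}
	return calls
}

func main() {
	validators := []string{"pod-validator"}
	noop := func(v, ns string) {}

	// Empty namespace list: the inner loop body never runs.
	fmt.Println(validateAll(validators, nil, noop))
	// A single "default" entry: the validator runs once.
	fmt.Println(validateAll(validators, []string{"default"}, noop))
}
```

With no namespaces, no validator is ever invoked, so no resources are returned to the caller regardless of what is deployed.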
tejal29 commented 2 years ago

I figured out why I never ran into this issue: I had a namespace set in my active context. When running skaffold, this line of code always returned "default" for me: https://github.com/GoogleContainerTools/skaffold/blob/bcbdfe043c2f334f919fa2e6ae06aed4a7578486/pkg/skaffold/deploy/util/namespaces.go#L37

After unsetting the namespace in my active context, I was able to reproduce this.
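One way to guard against an unset context namespace is to fall back to "default", which is also what kubectl does when the context names no namespace. The following is a minimal sketch under that assumption; resolveNamespace is a hypothetical helper, not a Skaffold function, and it is not necessarily what the eventual fix does.

```go
package main

import "fmt"

// resolveNamespace returns the namespace configured in the active context,
// falling back to "default" when none is set. Hypothetical helper.
func resolveNamespace(contextNamespace string) string {
	if contextNamespace == "" {
		return "default"
	}
	return contextNamespace
}

func main() {
	fmt.Printf("%q\n", resolveNamespace(""))        // unset namespace
	fmt.Printf("%q\n", resolveNamespace("staging")) // explicit namespace
}
```

With a fallback like this, the namespace list would contain at least one entry even when the active context has no namespace.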

tejal29 commented 2 years ago

I think this should help. https://github.com/GoogleContainerTools/skaffold/pull/7691