Closed chiragthaker closed 5 years ago
[ool-9-thread-13] c.n.s.c.o.DefaultOrchestrationProcessor : java.lang.NullPointerException
    at com.netflix.spinnaker.clouddriver.kubernetes.v2.description.manifest.KubernetesManifestAnnotater.getTraffic(KubernetesManifestAnnotater.java:231)
    at com.netflix.spinnaker.clouddriver.kubernetes.v2.op.manifest.AbstractKubernetesEnableDisableManifestOperation.determineLoadBalancers(AbstractKubernetesEnableDisableManifestOperation.java:73)
    at com.netflix.spinnaker.clouddriver.kubernetes.v2.op.manifest.AbstractKubernetesEnableDisableManifestOperation.operate(AbstractKubernetesEnableDisableManifestOperation.java:131)
    at com.netflix.spinnaker.clouddriver.kubernetes.v2.op.manifest.AbstractKubernetesEnableDisableManifestOperation.operate(AbstractKubernetesEnableDisableManifestOperation.java:39)
    at com.netflix.spinnaker.clouddriver.orchestration.AtomicOperation$operate.call(Unknown Source)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
    at com.netflix.spinnaker.clouddriver.orchestration.AtomicOperation$operate.call(Unknown Source)
    at com.netflix.spinnaker.clouddriver.orchestration.DefaultOrchestrationProcessor$_process_closure1$_closure2.doCall(DefaultOrchestrationProcessor.groovy:89)
    at com.netflix.spinnaker.clouddriver.orchestration.DefaultOrchestrationProcessor$_process_closure1$_closure2.doCall(DefaultOrchestrationProcessor.groovy)
    at sun.reflect.GeneratedMethodAccessor559.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:101)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:263)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
    at groovy.lang.Closure.call(Closure.java:405)
    at groovy.lang.Closure.call(Closure.java:399)
    at com.netflix.spinnaker.clouddriver.metrics.TimedCallable$ClosureWrapper.call(TimedCallable.groovy:55)
    at com.netflix.spinnaker.clouddriver.metrics.TimedCallable.call(TimedCallable.groovy:82)
    at java_util_concurrent_Callable$call.call(Unknown Source)
    at com.netflix.spinnaker.clouddriver.orchestration.DefaultOrchestrationProcessor$_process_closure1.doCall(DefaultOrchestrationProcessor.groovy:88)
    at com.netflix.spinnaker.clouddriver.orchestration.DefaultOrchestrationProcessor$_process_closure1.doCall(DefaultOrchestrationProcessor.groovy)
    at sun.reflect.GeneratedMethodAccessor556.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:101)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:263)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
    at groovy.lang.Closure.call(Closure.java:405)
    at groovy.lang.Closure.call(Closure.java:399)
    at com.netflix.spinnaker.security.AuthenticatedRequest.lambda$propagate$0(AuthenticatedRequest.java:129)
    at com.netflix.spinnaker.clouddriver.metrics.TimedCallable.call(TimedCallable.groovy:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
When you are not using one of the deployment strategies, does the deploy manifest stage complete and the clusters tab become populated with the new replica set?
Also, what account & namespace are you deploying to? Spinnaker for GCP is configured to not index the spinnaker namespace of the spinnaker-install-acount. So if you are deploying to that account, you have to explicitly specify namespaces for your resources.
Hi @duftler:
"When you are not using one of the deployment strategies, does the deploy manifest stage complete and the clusters tab become populated with the new replica set?" No. It takes around 12 minutes, sometimes more, for the replica set to show up, so I believe it is a caching issue.
"Also, what account & namespace are you deploying to? Spinnaker for GCP is configured to not index the spinnaker namespace of the spinnaker-install-acount. So if you are deploying to that account, you have to explicitly specify namespaces for your resources." We are not using this default namespace and account; we have a different namespace and account configured for deployment.
Also, for the Kubernetes service account we have enabled live manifest calls, which largely removed the force cache refresh wait for every job and so significantly decreased deployment time.
@maggieneterval fyi
Hi @chiragthaker -- The NPE you posted above suggests that Clouddriver is unable to read annotations from your manifest. A couple of questions to help us get to the bottom of this:
Hi @maggieneterval : Spinnaker version 1.15.3
Deploy manifest stage json :
{
  "account": "spinnaker-app-test-deploy-account",
  "cloudProvider": "kubernetes",
  "manifestArtifactAccount": "github-artifact-acc",
  "manifestArtifactId": "b04ea0c9-43a7-4ae4-8465-6cc83479f9f8",
  "moniker": {
    "app": "sample"
  },
  "name": "Deploy Application",
  "relationships": {
    "loadBalancers": [],
    "securityGroups": []
  },
  "requiredArtifactIds": [],
  "skipExpressionEvaluation": false,
  "source": "artifact",
  "trafficManagement": {
    "enabled": true,
    "options": {
      "enableTraffic": true,
      "namespace": "demo-app",
      "services": [
        "service demo-app"
      ],
      "strategy": "highlander"
    }
  },
  "type": "deployManifest"
}
Thanks @chiragthaker, would you also mind posting your full manifest YAML? Thanks!
Hi @maggieneterval: here is the deployment YAML for our config. We have separate YAML files for the ingress, services, and namespace, but this is the deployment YAML.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: demo-app-replicaset
  namespace: demo-app
spec:
  selector:
    matchLabels:
      name: demo-app-replicaset
  replicas: 1 # run 1 pod matching the template
  template:
    metadata:
      labels:
        name: demo-app-replicaset
    spec:
      containers:
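For reference, the stack trace at the top of this thread raises the NPE in KubernetesManifestAnnotater.getTraffic, and the manifest above has no metadata.annotations block. The following is only an illustrative Python sketch of a null-safe annotation read, not Clouddriver's actual code. The annotation key `traffic.spinnaker.io/load-balancers` and the helper name `load_balancers` are assumptions introduced here for illustration.

```python
import json
from typing import Any, Dict, List

# Assumed annotation key; it is not quoted anywhere in this thread.
TRAFFIC_ANNOTATION = "traffic.spinnaker.io/load-balancers"


def load_balancers(manifest: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: read the load balancer list without assuming annotations exist."""
    metadata = manifest.get("metadata") or {}
    annotations = metadata.get("annotations") or {}
    raw = annotations.get(TRAFFIC_ANNOTATION)
    if not raw:
        # A missing or empty annotation is treated as "no load balancers".
        return []
    # The value is assumed to be a JSON-encoded list of strings.
    return list(json.loads(raw))


# Mirrors the metadata of the ReplicaSet posted above (no annotations).
replica_set = {
    "apiVersion": "apps/v1",
    "kind": "ReplicaSet",
    "metadata": {"name": "demo-app-replicaset", "namespace": "demo-app"},
}
annotated = {
    "metadata": {"annotations": {TRAFFIC_ANNOTATION: '["service demo-app"]'}}
}

print(load_balancers(replica_set))
print(load_balancers(annotated))
```

A lookup written without the `or {}` fallbacks would fail on the first manifest, which is the general shape of failure the trace points at.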
Thanks! Your config all looks good, not sure at first glance why your deploy is failing. I can dig a little deeper into this later this week, in the meantime could you let me know how you added your Kubernetes account to Spinnaker for GCP, and how you upgraded your Spinnaker version to 1.15?
Hi @maggieneterval: we used the managed scripts for both.
For adding the Kubernetes account to Spinnaker: https://github.com/GoogleCloudPlatform/spinnaker-for-gcp/blob/master/scripts/manage/add_gke_account.sh
For upgrading: https://github.com/GoogleCloudPlatform/spinnaker-for-gcp/blob/master/scripts/manage/update_spinnaker_version.sh
@maggieneterval: The deploy fails on the second run when I perform two deployments back to back.
So the flow goes like this.
So the main issue here is that successive deployments within a span of about 5 minutes don't work with rolling strategies, which makes me suspect some weird caching issue.
Hope this explains the exact scenario.
Hi @maggieneterval: Have you had a chance to look at this?
"So the main issue here is that successive deployments within a span of about 5 minutes don't work with rolling strategies, which makes me suspect some weird caching issue."
Thanks for clarifying the issue you're facing -- to confirm, do you have liveManifestCalls enabled? Spinnaker-managed rollout strategies rely on caching and so are not compatible with live manifest mode being enabled.
Hi @maggieneterval: Yes, liveManifestCalls is enabled. We had to enable it; otherwise the force cache refresh task would take around 12 minutes per stage, which was not a workable solution for us.
Thanks for letting me know, I'm sorry that the force cache refresh task is taking so long. Do you notice any errors in your Orca or Clouddriver logs during the force cache refresh? Unfortunately for the time being you will need to choose between either enabling liveManifestCalls or using Spinnaker-managed rollout strategies, but hopefully we can address the root cause of the long force cache refresh so you are able to disable liveManifestCalls and use the strategies.
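The either/or described above can be checked mechanically before a pipeline runs. The sketch below is a minimal Python illustration under stated assumptions: `rollout_conflict` is a hypothetical helper, the account is represented as a plain dict with a `liveManifestCalls` key, and a strategy value of "none" is assumed to mean that no Spinnaker-managed strategy is in use. The stage fields follow the deploy manifest stage JSON posted earlier in this thread.

```python
from typing import Any, Dict, Optional


def rollout_conflict(stage: Dict[str, Any], account: Dict[str, Any]) -> Optional[str]:
    """Hypothetical check: warn when a strategy-based deploy targets a live-manifest account."""
    traffic = stage.get("trafficManagement") or {}
    options = traffic.get("options") or {}
    strategy = options.get("strategy")
    # Assumption: a missing, empty, or "none" strategy means no managed rollout.
    uses_strategy = bool(traffic.get("enabled")) and strategy not in (None, "", "none")
    if uses_strategy and account.get("liveManifestCalls"):
        return (
            f"stage {stage.get('name')!r} uses strategy {strategy!r}, "
            f"but account {account.get('name')!r} has liveManifestCalls enabled"
        )
    return None


# Trimmed copy of the stage JSON from this thread.
stage = {
    "account": "spinnaker-app-test-deploy-account",
    "name": "Deploy Application",
    "trafficManagement": {
        "enabled": True,
        "options": {"enableTraffic": True, "strategy": "highlander"},
    },
    "type": "deployManifest",
}
# Assumed shape of the account settings; only the liveManifestCalls name comes from the thread.
account = {"name": "spinnaker-app-test-deploy-account", "liveManifestCalls": True}

print(rollout_conflict(stage, account))
```

With `liveManifestCalls` set to false on the account, or with traffic management disabled on the stage, the helper returns None.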
Looks from my read of this thread like this can be closed (since we are not intending to add support for rollout strategies with liveManifestCalls enabled).
Please re-open if I've misread this somehow.
When deploying a new replica set via Kubernetes v2 using the highlander strategy or any other strategy, the infrastructure tab doesn't pick up the deployment status as soon as the deploy finishes. It takes 10-15 minutes to appear, or sometimes it disappears entirely.
We are stuck on this and trying to resolve it.