gocd / kubernetes-elastic-agents

Kubernetes Elastic agent plugin for GoCD
https://www.gocd.org
Apache License 2.0

Job not being scheduled to elastic agent #56

Closed cfontes closed 6 years ago

cfontes commented 6 years ago

Hello guys, first thanks for the plugin!!!

So I am using the GoCD Chart to deploy my cluster and also the elastic agents. I followed the steps to configure the plugin, and it was working with the demo that comes with the chart and uses elastic agents.

So I created the following profile, named jdk-8-mvn-helm, which is the bundle I need.

    apiVersion: v1
    kind: Pod
    metadata:
      name: jdk8-{{ POD_POSTFIX }}
      labels:
        app: web
    spec:
      containers:

I added it to my Job, so it now shows the Elastic Profile Id as jdk-8-mvn-helm.

When the job is started, I see the container being created in my minikube, but my job never runs and just hangs forever in an unassigned state.

In the server logs I get this exception:

    2018-07-19 14:00:04,674 WARN [qtp32863545-29] HttpChannel:568 - /go
    java.lang.IllegalStateException: Committed
        at org.eclipse.jetty.server.HttpChannel.resetBuffer(HttpChannel.java:841)
        at org.eclipse.jetty.server.HttpOutput$Interceptor.resetBuffer(HttpOutput.java:116)
        at org.eclipse.jetty.server.HttpOutput.resetBuffer(HttpOutput.java:928)
        at org.eclipse.jetty.server.Response.resetBuffer(Response.java:1312)
        at org.eclipse.jetty.server.Response.sendRedirect(Response.java:720)
        at org.eclipse.jetty.server.Response.sendRedirect(Response.java:729)
        at org.eclipse.jetty.server.handler.ContextHandler.checkContext(ContextHandler.java:1048)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1100)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:527)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.server.Server.handle(Server.java:530)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
        at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
        at java.lang.Thread.run(Thread.java:748)
    2018-07-19 14:00:05,047 WARN [qtp32863545-25] HttpChannel:568 - /go
    java.lang.IllegalStateException: Committed
        at org.eclipse.jetty.server.HttpChannel.resetBuffer(HttpChannel.java:841)
        at org.eclipse.jetty.server.HttpOutput$Interceptor.resetBuffer(HttpOutput.java:116)
        at org.eclipse.jetty.server.HttpOutput.resetBuffer(HttpOutput.java:928)
        at org.eclipse.jetty.server.Response.resetBuffer(Response.java:1312)
        at org.eclipse.jetty.server.Response.sendRedirect(Response.java:720)
        at org.eclipse.jetty.server.Response.sendRedirect(Response.java:729)
        at org.eclipse.jetty.server.handler.ContextHandler.checkContext(ContextHandler.java:1048)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1100)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:527)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.server.Server.handle(Server.java:530)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
        at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
        at java.lang.Thread.run(Thread.java:748)

Here is the log before this exception on a manual start:

    2018-07-19 13:59:57,124 INFO [qtp32863545-23] PipelineTriggerService:44 - [Pipeline Schedule] [Requested] Manual trigger of pipeline 'StocksCash' requested by anonymous
    2018-07-19 13:59:57,127 INFO [qtp32863545-23] PipelineTriggerService:47 - [Pipeline Schedule] [Accepted] Manual trigger of pipeline 'StocksCash' accepted for user anonymous
    2018-07-19 13:59:57,131 INFO [qtp32863545-23] PipelineTriggerService:49 - [Pipeline Schedule] [Processed] Manual trigger of pipeline 'StocksCash' processed with result 'com.thoughtworks.go.serverhealth.ServerHealthState@1ede2708[healthStateLevel=OK,type=<HealthStateType ARTIFACTS_DISK_FULL LogScope[GLOBAL, scope=GLOBAL]>,message=,description=,expiryTime=,timestamp=Thu Jul 19 13:59:57 GMT 2018]'
    2018-07-19 13:59:59,884 INFO [ThreadPoolTaskScheduler-10] ScheduleService:155 - [Pipeline Schedule] Scheduling pipeline StocksCash with build cause [ManualForcedBuildCause: Forced by anonymous]
    2018-07-19 13:59:59,889 INFO [ThreadPoolTaskScheduler-10] PipelineRepository:86 - Start updating pipeline timeline
    2018-07-19 13:59:59,893 INFO [ThreadPoolTaskScheduler-10] PipelineRepository:92 - Pipeline timeline updated
    2018-07-19 14:00:00,569 INFO [152@MessageListener for ServerPingListener] p.c.g.c.e.k.c.g.c.e.KubernetesPlugin:73 [plugin-cd.go.contrib.elasticagent.kubernetes] - [refresh-pod-state] Pod information successfully synced. All(Running/Pending) pod count is 4.
    2018-07-19 14:00:03,749 INFO [149@MessageListener for CreateAgentListener] p.c.g.c.e.k.c.g.c.e.KubernetesPlugin:73 [plugin-cd.go.contrib.elasticagent.kubernetes] - [refresh-pod-state] Pod information successfully synced. All(Running/Pending) pod count is 4.
    2018-07-19 14:00:03,759 INFO [149@MessageListener for CreateAgentListener] p.c.g.c.e.k.c.g.c.e.KubernetesPlugin:73 [plugin-cd.go.contrib.elasticagent.kubernetes] - [Create Agent] Creating K8s pod with spec: Pod(apiVersion=v1, kind=Pod, metadata=ObjectMeta(annotations={Privileged=true, MaxCPU=, Environment=GO_SERVER_URL=https://10.98.122.134:8154/go GO_EA_SERVER_URL=https://10.98.122.134:8154/go, Elastic-Agent-Job-Identifier={"pipeline_name":"StocksCash","pipeline_counter":34,"pipeline_label":"34","stage_name":"Build","stage_counter":"1","job_name":"Test","job_id":74}, Image=travix/gocd-agent-gcloud-jdk-8:18.6.0, PodConfiguration=apiVersion: v1 kind: Pod metadata: name: jdk8-{{ POD_POSTFIX }} labels: app: web spec: containers:

arvindsv commented 6 years ago

I could be wrong, but the chosen image (travix/gocd-agent-gcloud-jdk-8) doesn't seem to be set up for elastic agents. So it's probably coming up, but doesn't know how to auto-register using the environment variables that the plugin provides when starting it.

This section in the docker elastic agent's README mentions what is needed to use a custom docker image with elastic agents. This is how those variables are used in GoCD's official docker image.

Maybe we need to mention the same in the README of these elastic agents. /cc @sheroy

sheroy commented 6 years ago

@cfontes, this is correct. We'll add these instructions in the k8s Elastic Agent README. For now, can you confirm that using the official GoCD agent Docker image works for you?

cfontes commented 6 years ago

@arvindsv Thanks for the details

@sheroy just tried with gocd/gocd-agent-docker-dind:v18.6.0 and it works.

I saw that guide while troubleshooting yesterday, but with my limited knowledge it was not clear to me what I was missing or why. I will try to adapt it and try again.

arvindsv commented 6 years ago

@sheroy If it's possible to provide a template for converting a non-elastic-agent Docker image to an elastic agent one, then we should. I mean, if it's just the creation of an autoregister.properties file in the right place, we might be able to do that. Let me know if you're ok taking a look at it.

@cfontes The agent is the same. It's just that in the context of it being started from an elastic agent plugin, the environment variables it needs to use to connect to the GoCD server are different. The Docker images which are used in the elastic agent context make sure that the environment variables are used.
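As a rough illustration of the "creation of an autoregister.properties file" step, here is a minimal Python sketch. Only `GO_EA_SERVER_URL` appears in the log above; the other `GO_EA_*` variable names and the property keys are assumptions recalled from the docker elastic agent README and should be checked against it. The function names are hypothetical, and the real images do this in their shell entrypoint rather than in Python.

```python
import os
from pathlib import Path

# ASSUMPTION: mapping from plugin-provided environment variables to
# autoregister.properties keys. Verify the names against the docker
# elastic agent README before relying on them.
ENV_TO_PROPERTY = {
    "GO_EA_AUTO_REGISTER_KEY": "agent.auto.register.key",
    "GO_EA_AUTO_REGISTER_ENVIRONMENT": "agent.auto.register.environments",
    "GO_EA_AUTO_REGISTER_ELASTIC_AGENT_ID": "agent.auto.register.elasticAgent.agentId",
    "GO_EA_AUTO_REGISTER_ELASTIC_PLUGIN_ID": "agent.auto.register.elasticAgent.pluginId",
}


def render_autoregister(env):
    """Return the text of autoregister.properties built from `env`.

    Variables missing from `env` are written with an empty value.
    """
    lines = [f"{prop}={env.get(var, '')}" for var, prop in ENV_TO_PROPERTY.items()]
    return "\n".join(lines) + "\n"


def write_autoregister(config_dir, env=None):
    """Write autoregister.properties into `config_dir` and return its path.

    `config_dir` is hypothetical here; the real location depends on the
    agent image being adapted.
    """
    env = os.environ if env is None else env
    target = Path(config_dir) / "autoregister.properties"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_autoregister(env))
    return target


if __name__ == "__main__":
    # Print what would be written for the current environment.
    print(render_autoregister(os.environ), end="")
```

The agent would also have to be pointed at the server URL from `GO_EA_SERVER_URL` instead of its usual setting, and this step needs to run inside the agent container before the agent process starts.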

cfontes commented 6 years ago

@arvindsv

I just got it working. I love this feature, thank you for it!

It would be great to have a way to do this in the Helm Chart. In my case I added a config map containing the script that generates the config file, but the agent container needs to run that script, so it requires some specific changes to the container.

Maybe some kind of sidecar or init container that you could just copy and paste.

sheroy commented 6 years ago

@arvindsv Yes, I can look into this soon.

@cfontes, can I also catch up with you at some point to talk to you about your use case and how you are using our K8S plugin?

cfontes commented 6 years ago

@sheroy, as soon as I have something more solid, we sure can.

Thanks, I will close this issue ;)