QubitProducts / bamboo

HAProxy auto configuration and auto service discovery for Mesos Marathon
Apache License 2.0

error in template #161

Closed bateau84 closed 8 years ago

bateau84 commented 8 years ago

Hello! I'm getting a strange error.

2015-09-08 13:17:53,879 DEBG 'bamboo' stdout output:
2015/09/08 13:17:53 Failed to retrieve template data
2015/09/08 13:17:53 Failed to update HAProxy configuration: invalid character '<' looking for beginning of value

I'm using the default template from this repo's config dir.

lclarkmichalek commented 8 years ago

I ran into this in production yesterday; restarting the nodes solved it for me. Now, that's not to say this isn't a bug, but I have a feeling that getting the cluster into a funky state is the most reliable way to replicate it. Can you post any more details about the rest of your setup (number of Mesos masters, their state, same for Marathon, etc.)?

bateau84 commented 8 years ago

My setup consists of 1 server that runs Mesos master, Marathon, ZooKeeper and Bamboo, plus 2 slaves. Marathon version 0.10.0, Mesos version 0.23.0 (slaves and master). Bamboo is running the latest commit on master. I also tried the reload_rework branch, but with the same result.

Bamboo and HAProxy are running in a Docker (version 1.8.1) container, and Bamboo's config is set through environment variables passed to docker run. docker exec bamboo env shows:

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=****************
MARATHON_ENDPOINT=http://10.0.0.11:8080
MARATHON_USER=**********
MARATHON_PASSWORD=*************
BAMBOO_ENDPOINT=http://10.0.0.11:8000
BAMBOO_ZK_HOST=10.0.0.11
BAMBOO_ZK_PATH=/bamboo/state
HAPROXY_TEMPLATE_PATH=/etc/bamboo/haproxy_template.cfg
HAPROXY_OUTPUT_PATH=/etc/haproxy/haproxy.cfg
HAPROXY_RELOAD_CMD=haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -D -sf $(cat /var/run/haproxy.pid)
STATSD_ENABLED=false
DEBIAN_FRONTEND=noninteractive
GOPATH=/opt/go
HOME=/root

lclarkmichalek commented 8 years ago

Ah, that sucks. I was hoping to blame the issue on a split brain or something outside our control. However, this sounds more and more like a straight-up bug in Bamboo, probably in the ZK sync code. Don't suppose you could take a dump of your Bamboo ZK path, clear it out, and then try again? Though if it's only 1 node it really shouldn't be getting too confused. I'm really not sure on this one.

bateau84 commented 8 years ago

I changed the ZK path to /bamboo/config instead of /bamboo/state. It did not change the outcome; I still get the same error. I also pulled the merged changes just now and rebuilt the Docker image.

lclarkmichalek commented 8 years ago

Hmm, not sure then. I'll probably defer to @activars; this is not part of the code I'm overly familiar with (i.e. most of it).

j1n6 commented 8 years ago

Looks like Bamboo running in the container failed to contact or get information from Marathon. Can you get into the container and see if you can curl the Marathon endpoint?

If you can, it might be a compatibility issue with Marathon 0.10.0.

jdubs commented 8 years ago

Hi, I'm also seeing a similar issue. When tailing the Marathon logs, I see a 401:

Sep 8 22:25:05 mesos-master1-env marathon[20306]: [2015-09-08 22:25:05,421] INFO 10.25.123.123 - - [08/Sep/2015:22:25:05 +0000] "GET /v2/tasks HTTP/1.1" 401 1281 "-" "Go 1.1 package http" (mesosphere.chaos.http.ChaosRequestLog$$EnhancerByGuice$$23e8919e:15)

I've tried passing in Marathon.User & Marathon.Password at run time, via the config file, and via ENV vars, but it still results in a 401.

It appears that the HTTP client isn't passing the basic auth credentials when making the GET. Marathon sends back some HTML, and the JSON parser bails on the first '<'.
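For readers hitting the same message, here is a minimal Go sketch of why an HTML error page produces exactly this error when it is fed to a JSON decoder. The `decodeTasks` function and its struct shape are hypothetical stand-ins for illustration only, not Bamboo's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeTasks is a hypothetical stand-in for code that decodes a
// Marathon /v2/tasks response body. The struct shape is illustrative.
func decodeTasks(body []byte) error {
	var payload struct {
		Tasks []struct {
			AppID string `json:"appId"`
		} `json:"tasks"`
	}
	return json.Unmarshal(body, &payload)
}

func main() {
	// An HTML error page, as a server might return alongside a 401.
	htmlBody := []byte("<html><body>Error 401 Unauthorized</body></html>")
	fmt.Println(decodeTasks(htmlBody))
	// prints: invalid character '<' looking for beginning of value

	// A well-formed JSON body decodes without error.
	jsonBody := []byte(`{"tasks":[{"appId":"/web"}]}`)
	fmt.Println(decodeTasks(jsonBody))
	// prints: <nil>
}
```

The error text comes from `encoding/json` itself, so it says nothing about the template; it only means the response body was not JSON.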

https://github.com/QubitProducts/bamboo/blob/eb4923fbcdcd271334b919d4ff9a299056ddb8a0/services/marathon/marathon.go#L135

The POST to /v2/eventSubscriptions succeeds with a 200 because it passes in the auth credentials.

bateau84 commented 8 years ago

@activars I tried curl from within the container, and it returned 200 OK. I think @jdubs is on to something here. In main/bamboo/bamboo.go, "func registerMarathonEvent" sets the username and password for Marathon, but in services/marathon/marathon.go it looks like neither "func fetchTasks" nor "func fetchMarathonApps" sets them. I'm not familiar with the code or with Go, so this is just a long shot from me.
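As a rough illustration of the idea (not the code from any patch to Bamboo), a fetch helper could set basic auth on every request and check the status code before decoding, so a 401 surfaces as a clear error rather than a JSON parse failure. `fetchJSON` and `newFakeMarathon` are hypothetical names; the fake server only stands in for a Marathon instance that requires credentials.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fetchJSON is a hypothetical helper: it sends a GET with optional
// basic auth and refuses to decode anything but a 200 response.
func fetchJSON(url, user, password string, out interface{}) error {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return err
	}
	req.Header.Set("Accept", "application/json")
	if user != "" {
		req.SetBasicAuth(user, password)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("GET %s: unexpected status %s", url, resp.Status)
	}
	return json.NewDecoder(resp.Body).Decode(out)
}

// newFakeMarathon returns a local test server (an assumption for this
// sketch) that answers with HTML and a 401 unless credentials match.
func newFakeMarathon(user, password string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		u, p, ok := r.BasicAuth()
		if !ok || u != user || p != password {
			w.Header().Set("Content-Type", "text/html")
			w.WriteHeader(http.StatusUnauthorized)
			fmt.Fprint(w, "<html><body>Error 401</body></html>")
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"tasks":[]}`)
	}))
}

func main() {
	srv := newFakeMarathon("admin", "secret")
	defer srv.Close()

	var out map[string]interface{}
	// Without credentials: a status error instead of a JSON parse error.
	fmt.Println(fetchJSON(srv.URL+"/v2/tasks", "", "", &out))
	// With credentials: the body decodes and the error is nil.
	fmt.Println(fetchJSON(srv.URL+"/v2/tasks", "admin", "secret", &out))
}
```

Checking the status before decoding is a design choice in this sketch: it keeps authentication problems from being reported as template or parsing errors.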

jdubs commented 8 years ago

I believe this patch should fix it. https://github.com/QubitProducts/bamboo/pull/162

bateau84 commented 8 years ago

Thanks! I will check this out first thing in the morning! :D

bateau84 commented 8 years ago

Fixed :+1:

j1n6 commented 8 years ago

Thanks for the confirmation :)

giuffresoft79 commented 8 years ago

I have this error. How can I resolve it?