autopilotpattern / mysql

Implementation of the autopilot pattern for MySQL
Mozilla Public License 2.0

Not starting cluster, can't find mysql-primary #105

Closed bhechinger closed 6 years ago

bhechinger commented 6 years ago

So I'm trying to get this set up and I'm running into this issue:

2018/06/19 19:45:08 DEBUG manage Starting new HTTP connection (1): consul
2018/06/19 19:45:08 DEBUG manage http://consul:8500 "GET /v1/health/service/mysql-primary?passing=1 HTTP/1.1" 200 2
2018/06/19 19:45:08 DEBUG manage []
2018/06/19 19:45:08 DEBUG manage could not determine primary via Consul: No primary found
2018/06/19 19:45:08 DEBUG manage [health] consul.read_lock start
2018/06/19 19:45:08 DEBUG manage http://consul:8500 "GET /v1/kv/mysql-primary HTTP/1.1" 200 171
2018/06/19 19:45:08 DEBUG manage [health] consul.read_lock end: (u'e3aeb4b2-8463-1dee-7fc9-c67b4e303a8d', 'mysql-98580368fd27')
2018/06/19 19:45:08 DEBUG manage [health] node.is_primary end: True

For whatever reason the mysql-primary service never gets created.

I do have a mysql service in consul, however. Could this just be getting named incorrectly?

Thanks!!

-brian

bhechinger commented 6 years ago

docker-compose is:

  mysql:
    image: autopilotpattern/mysql:${TAG:-latest}
    scale: 2
    labels:
      - triton.cns.services=mysql
      - com.docker.swarm.affinities=["container!=~*mysql*"]
    mem_limit: 128m
    restart: always
    expose:
      - 3306
    network_mode: chremoas_dev
    environment:
      - MYSQL_USER=dbuser
      - MYSQL_PASSWORD=seekretPassword
      - MYSQL_REPL_USER=repluser
      - MYSQL_REPL_PASSWORD=seekretReplPassword
      - MYSQL_DATABASE=demodb
      - BACKUP_TTL=120
      - LOG_LEVEL=DEBUG
      - CONSUL=consul
      - SNAPSHOT_BACKEND=manta
      - MANTA_URL=https://us-east.manta.joyent.com
      - MANTA_USER=sdfsdfsdf
      - MANTA_KEY_ID=sdfsdfsdf
      - MANTA_PRIVATE_KEY=chremoas_dev-mysql_id_rsa
      - MANTA_BUCKET=~~/stor/chremoas_dev-mysql

dfredell commented 6 years ago

@bhechinger In a healthy HA system there will be one mysql-primary service and X mysql replication services in consul. I would check if there is a value and lock on /kv/mysql-primary.
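A rough way to script that check, as a sketch only: it reads the two Consul endpoints that appear in the debug log above (`/v1/kv/mysql-primary` and `/v1/health/service/mysql-primary?passing=1`) and reports whether the lock and the service agree. The `http://consul:8500` address comes from the log; the function names and status strings are made up for illustration and are not part of manage.py.

```python
import base64
import json
from urllib.error import HTTPError
from urllib.request import urlopen

CONSUL = "http://consul:8500"  # address as seen in the log; adjust for your setup


def fetch_json(path):
    """GET a Consul API path and return the parsed JSON body ([] on 404)."""
    try:
        with urlopen(CONSUL + path, timeout=5) as resp:
            return json.load(resp)
    except HTTPError as err:
        if err.code == 404:  # Consul answers 404 when a KV key does not exist
            return []
        raise


def decode_kv(entries):
    """Return (value, session) from a /v1/kv response, or (None, None)."""
    if not entries:
        return None, None
    entry = entries[0]
    raw = entry.get("Value")  # Consul returns KV values base64-encoded
    value = base64.b64decode(raw).decode("utf-8") if raw else None
    # "Session" is only present while a session holds the lock on the key
    return value, entry.get("Session")


def diagnose(kv_entries, passing_services):
    """Compare the KV lock with the list of passing mysql-primary services."""
    value, session = decode_kv(kv_entries)
    has_lock = value is not None and session is not None
    registered = bool(passing_services)
    if has_lock and registered:
        return "in_sync"
    if has_lock:
        return "lock_without_service"  # the state shown in the log above
    if registered:
        return "service_without_lock"
    return "no_primary"


def check_primary():
    """Query Consul and return one of the hypothetical status strings."""
    kv = fetch_json("/v1/kv/mysql-primary")
    passing = fetch_json("/v1/health/service/mysql-primary?passing=1")
    return diagnose(kv, passing)
```

Calling `check_primary()` from a host that can reach Consul would return `lock_without_service` for the situation in the original report, where the KV lock names `mysql-98580368fd27` but the health query returns `[]`.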

I sometimes get into this situation when the initial snapshot isn't successfully written to the snapshot backend (minio in my case). I normally run docker exec <inst> python /usr/local/bin/manage.py on_change to get the Consul KV back in sync with the services.

Or check out https://github.com/certusoft/mysql, as Joyent has given up supporting the autopilotpattern for now.

bhechinger commented 6 years ago

@dfredell That was exactly it, thanks!!

I'm upset about them not supporting autopilotpattern; it seems really nice from what I've seen so far.

bhechinger commented 6 years ago

Well, gave certusoft/mysql a go and it still doesn't want to come up healthy. I've punted for now and will just use a mariadb infrastructure container. :(

dfredell commented 6 years ago

:( autopilotpattern/mysql is such a good idea. I wish it were better supported.