compose / governor

Runners to orchestrate a high-availability PostgreSQL
MIT License

pg_ctl: directory "data/postgres" does not exist #13

Closed by tvb 9 years ago

tvb commented 9 years ago

When I run a clean install, I get no output after executing ./governor.py postgres1.yml. However, if I Ctrl+C the application, I receive the following error:

postgres@sql1:~/governor$ ./governor.py postgres1.yml
^CTraceback (most recent call last):
  File "./governor.py", line 48, in <module>
    time.sleep(5)
KeyboardInterrupt
pg_ctl: directory "data/postgres" does not exist

Why is the data directory not being created during the first run?

tvb commented 9 years ago

Creating the data directory manually gives a new error:

$ mkdir -p data/postgres
postgres@sql1:~/governor$ ./governor.py postgresql.yml
^CTraceback (most recent call last):
  File "./governor.py", line 48, in <module>
    time.sleep(5)
KeyboardInterrupt
pg_ctl: directory "data/postgres" is not a database cluster directory

tvb commented 9 years ago

After adding some more debugging, I found that I am actually stuck in this loop:

        while not synced_from_leader:
            logging.info("I am not in sync")
            leader = etcd.current_leader()
            print(leader)
            if not leader:
                logging.info("I am not the leader, waiting 5 seconds")
                time.sleep(5)
                continue

because my leader is None. I am not sure how to recover from that.

tvb commented 9 years ago

Actually on sql1 node:

curl http://127.0.0.1:2379/v2/stats/leader
{"leader":"f3a45927640b6da1","followers":{"8d32f0c7cf61d86f":{"latency":{"current":0.005404,"average":0.005804184899796387,"standardDeviation":0.004551149296781203,"minimum":0,"maximum":0.453886},"counts":{"fail":0,"success":1084939}},"c351a9659cc1ca65":{"latency":{"current":0.004392,"average":0.004885503809053086,"standardDeviation":0.008000549516033555,"minimum":0,"maximum":6.972503},"counts":{"fail":0,"success":1085703}}}}

On the other two nodes:

curl http://127.0.0.1:2379/v2/stats/leader
{"message":"not current leader"}

So the leader is f3a45927640b6da1, which is the sql1 node.
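
Note that /v2/stats/leader reports the etcd Raft leader, which is not the same thing as the PostgreSQL leader that governor tracks in the etcd key space. A minimal sketch for checking governor's own keys follows. It assumes governor keeps a leader key under /service/<scope>/ next to the initialize key, and that the scope is batman; the helper names key_url, parse_value and read_key are made up for illustration and are not part of governor.

```python
import json
import urllib.error
import urllib.request


def key_url(host, scope, name):
    """Build the etcd v2 keys URL for a governor key (path layout is an assumption)."""
    return "http://%s/v2/keys/service/%s/%s" % (host, scope, name)


def parse_value(body):
    """Return the value stored in an etcd v2 response body, or None if there is no node."""
    doc = json.loads(body)
    node = doc.get("node")
    if node is None:
        # Error bodies such as "Key not found" carry no "node" entry.
        return None
    return node.get("value")


def read_key(host, scope, name):
    """Fetch one key over HTTP; etcd v2 answers 404 when the key does not exist."""
    try:
        with urllib.request.urlopen(key_url(host, scope, name), timeout=5) as resp:
            return parse_value(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise
```

Calling read_key("127.0.0.1:2379", "batman", "leader") and read_key("127.0.0.1:2379", "batman", "initialize") on a node shows whether governor has a leader recorded and whether an initialize value is left over from an earlier run.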

tvb commented 9 years ago

OK, I had to clear (rm) the initialize key, as there was probably a value left over from a previous run:

postgres@sql2:~/governor$ etcdctl -o extended get /service/batman/initialize
Key: /service/batman/initialize
Created-Index: 8
Modified-Index: 8
TTL: 0
Etcd-Index: 54
Raft-Index: 520353
Raft-Term: 9

sql1

Winslett commented 9 years ago

Typically, if you are starting a new test with governor, it is best to stop etcd and any governor processes. Then, run rm -rf data/* from the governor directory. Then, restart etcd + the governor processes.
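
The cleanup step can be sketched in Python as below. This is only an illustration, assuming etcd and all governor processes are already stopped and that it is run from the governor directory; clear_directory is a hypothetical helper, not something governor ships.

```python
import os
import shutil


def clear_directory(path):
    """Delete everything inside `path` but keep `path` itself.

    Similar to `rm -rf path/*`, except that hidden entries are removed too,
    because this does not go through shell globbing.
    """
    removed = []
    for entry in sorted(os.listdir(path)):
        full = os.path.join(path, entry)
        if os.path.isdir(full) and not os.path.islink(full):
            # Real directories are removed recursively.
            shutil.rmtree(full)
        else:
            # Plain files and symlinks are unlinked, never followed.
            os.remove(full)
        removed.append(entry)
    return removed
```

Running clear_directory("data") before restarting etcd and governor leaves an empty data directory, so the next start begins from a clean state.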