dolthub / dolt

Dolt – Git for Data
Apache License 2.0
17.97k stars 514 forks source link

Running a dolt sql server as a primary and a replica on the same instance causes read-only issues #5924

Open bheni opened 1 year ago

bheni commented 1 year ago

Setup the directories and configs:

mkdir replication_test
cd replication_test
mkdir primary
mkdir standby1
echo 'log_level: info
max_logged_query_len: 2048
behavior:
  read_only: false
  autocommit: true
  persistence_behavior: load
  disable_client_multi_statements: true
user:
  name: test
  password: test
listener:
  host: 0.0.0.0
  port: 3306
databases: []
performance:
  query_parallelism: 2
metrics:
  labels:
    process: dolt_sql_server
  host: 127.0.0.1
  port: 8888
cluster:
  standby_remotes:
  - name: standby1
    remote_url_template: http://127.0.0.1:50052/{database}
  bootstrap_role: primary
  bootstrap_epoch: 1
  remotesapi:
    address: 0.0.0.0
    port: 50051' > primary/config.yaml
echo 'log_level: info
max_logged_query_len: 2048
behavior:
  read_only: false
  autocommit: true
  persistence_behavior: load
  disable_client_multi_statements: true
user:
  name: test
  password: test
listener:
  host: 0.0.0.0
  port: 3307
databases: []
performance:
  query_parallelism: 2
metrics:
  labels:
    process: dolt_sql_server
  host: 127.0.0.1
  port: 8889
cluster:
  standby_remotes:
  - name: primary
    remote_url_template: http://127.0.0.1:50051/{database}
  bootstrap_role: standby
  bootstrap_epoch: 1
  remotesapi:
    address: 0.0.0.0
    port: 50052' > standby1/config.yaml

Then in two separate terminal windows with one in the primary directory and one in the standby1 directory run dolt sql-server --config config.yaml In a third terminal window connect to the primary and try to create a database and table:

~>mysql -h127.0.0.1 -p3306 -utest -ptest
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.7.9-Vitess

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [(none)]> create database db;
Query OK, 1 row affected (0.063 sec)

MySQL [(none)]> use db;
Database changed
MySQL [db]> create table t (
    -> pk int primary key,
    -> c1 varchar(8));
ERROR 1105 (HY000): Database db is read-only.

If you log into standby1 instead of primary you will now see that the empty database "db" was replicated.

I don't know if I have a config issue, but even if I do the behavior is not great.

reltuk commented 1 year ago

I think is probably caused by the second sql server coming up as a primary and the cluster entering detected_broken_config state. The actual cluster role is stored in a persisted SQL variable, and my understanding is that those are typically shared between sql-server invocations in different directories because they're stored in HOME/DOLT_ROOT_PATH.

reltuk commented 1 year ago

This might also be a particular quirk of how cluster replication stores the persistent role and epoch. I'm not sure if it stores it the exact same way as the default SQL path for persistent global variables does.

bheni commented 1 year ago

I was able to get this working by setting DOLT_ROOT_PATH to a different directory for each running server process.