gramineproject / gramine

A library OS for Linux multi-process applications, with Intel SGX support
GNU Lesser General Public License v3.0

MariaDB Replication Is Broken #1921

Closed harunkucuk5 closed 1 week ago

harunkucuk5 commented 5 months ago

Description of the problem

When we use MariaDB replication with Gramine, it fails with the following error:

Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave

Without Gramine everything works fine, but with Gramine we get this error. It is also not related to SGX, because we get the same error with gramine-direct as well.

Steps to reproduce

I used the https://github.com/enclaive/enclaive-docker-mariadb-sgx repository to reproduce this issue, but the issue exists on Gramine 1.5, 1.6, and 1.7 as well.

Clone the repository

git clone https://github.com/enclaive/enclaive-docker-mariadb-sgx.git

Build the Docker image

docker build -t gramine-mariadb:v2 .

Then create two instances from this Docker image and set up replication between them:

docker run -it --name master --security-opt seccomp=/root/docker_seccomp_aug_2022.json gramine-mariadb:v2
docker run -it --name replica --security-opt seccomp=/root/docker_seccomp_aug_2022.json gramine-mariadb:v2

CHANGE MASTER TO MASTER_HOST='<master-ip>', MASTER_USER='<user>', MASTER_PASSWORD='<password>', MASTER_USE_GTID=slave_pos;

Once replication is started, everything is fine at first, but some time later it fails with this error. You can reproduce this replication error by just running the following sysbench test:

sysbench oltp_read_only --mysql-host=127.0.0.1 --tables=10 --table_size=50000 --mysql-port=3306 --mysql-user=<user> --mysql-password="<password>" --mysql-db=sbtest prepare

Expected results

A running replication stream

Actual results

replication status

MariaDB [sbtest]> show slave status \G
*************************** 1. row ***************************
                Slave_IO_State:
                   Master_Host: 172.17.0.2
                   Master_User: root
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: ON.000001
           Read_Master_Log_Pos: 7060656
                Relay_Log_File: mysqld-relay-bin.000002
                 Relay_Log_Pos: 5487680
         Relay_Master_Log_File: ON.000001
              Slave_IO_Running: No
             Slave_SQL_Running: No
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 1594
                    Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 5487388
               Relay_Log_Space: 7061258
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: NULL
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 1595
                 Last_IO_Error: Relay log write failure: could not queue event from master
                Last_SQL_Errno: 1594
                Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-33
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State:
              Slave_DDL_Groups: 14
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 17

If we check the relay logs with the following command, we see that they are corrupted:


 show relaylog events IN 'mysqld-relay-bin.000002';
ERROR 1220 (HY000): Error when executing command SHOW BINLOG EVENTS: Wrong offset or I/O error

 show relaylog events IN 'mysqld-relay-bin.000002' limit 10,3;
+-------------------------+------+------------+-----------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Log_name                | Pos  | Event_type | Server_id | End_log_pos | Info                                                                                                                                                                                                                                                                                                                                                      |
+-------------------------+------+------------+-----------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| mysqld-relay-bin.000002 |  961 | Query      |         1 |         756 | create database test                                                                                                                                                                                                                                                                                                                                      |
| mysqld-relay-bin.000002 | 1048 | Gtid       |         1 |         798 | GTID 0-1-4                                                                                                                                                                                                                                                                                                                                                |
| mysqld-relay-bin.000002 | 1090 | Query      |         1 |        1207 | use `test`; CREATE TABLE test_table (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    email VARCHAR(255) NOT NULL,
    phone VARCHAR(20),
    address VARCHAR(255),
    city VARCHAR(100),
    state VARCHAR(100),
    postal_code VARCHAR(20),
    country VARCHAR(100),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
) |
+-------------------------+------+------------+-----------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.000 sec)

logs:

2024-06-24 11:51:51 9 [Note] Slave I/O thread: connected to master 'root@172.17.0.2:3306',replication starts at GTID position ''
2024-06-24 13:32:14 10 [ERROR] Error in Log_event::read_log_event(): 'Replication event checksum verification failed while reading from a log file', data_len: 524293, event_type: 2
2024-06-24 13:32:14 9 [ERROR] Slave I/O: Relay log write failure: could not queue event from master, Internal MariaDB error code: 1595
2024-06-24 13:32:14 9 [Note] Slave I/O thread exiting, read up to log 'ON.000001', position 7060656; GTID position 0-1-33, master 172.17.0.2:3306
2024-06-24 13:32:14 10 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
2024-06-24 13:32:14 10 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Internal MariaDB error code: 1594
2024-06-24 13:32:14 10 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'ON.000001' position 5487388; GTID position '0-1-30'
2024-06-24 13:32:14 10 [Note] Slave SQL thread exiting, replication stopped in log 'ON.000001' at position 5487388; GTID position '0-1-30', master: 172.17.0.2:3306
2024-06-24 21:32:43 11 [Warning] Aborted connection 11 to db: 'sbtest' user: 'root' host: '127.0.0.1' (Got timeout reading communication packets)

Gramine commit hash

10e93534169802be16fc9e2b3e9ac70d08efcb41

dimakuv commented 5 months ago

Where is this log mysqld-relay-bin.000002 located? In which absolute path?

harunkucuk5 commented 5 months ago

It is in the same path where the MariaDB data is stored:

root@b74866364bac:/data# pwd
/data
root@b74866364bac:/data# ls -la
total 131148
drwx------ 7 root root      4096 Jun 24 13:30 .
drwxr-xr-x 1 root root      4096 Jun 24 11:44 ..
-rw-r----- 1 root root      4096 Jun 24 11:44 ON.000001
-rw-r----- 1 root root      4096 Jun 24 11:44 ON.index
-rw-rw---- 1 root root    430080 Jun 24 11:51 aria_log.00000001
-rw-rw---- 1 root root      4096 Jun 24 11:44 aria_log_control
-rw-r----- 1 root root     24576 Jun 24 13:32 ddl_recovery.log
-rw-rw---- 1 root root      4096 Jun 24 11:44 ib_buffer_pool
-rw-r----- 1 root root 101715968 Jun 24 13:32 ib_logfile0
-rw-rw---- 1 root root  12718080 Jun 24 11:44 ibdata1
-rw-r----- 1 root root  12718080 Jun 24 11:44 ibtmp1
-rw-r----- 1 root root         0 Jun 24 11:51 master.info
-rw-r----- 1 root root      4096 Jun 24 11:44 multi-master.info
drwx------ 2 root root      4096 Jun 24 11:44 mysql
-rw-r--r-- 1 root root      4096 Jun 24 11:44 mysql_upgrade_info
-rw-r----- 1 root root      4096 Jun 24 11:51 mysqld-relay-bin.000001
-rw-r----- 1 root root   6610944 Jun 25 08:56 mysqld-relay-bin.000002
-rw-r----- 1 root root      4096 Jun 24 11:51 mysqld-relay-bin.index
drwx------ 2 root root      4096 Jun 24 11:44 performance_schema
-rw-r----- 1 root root      4096 Jun 24 11:51 relay-log.info
drwx------ 2 root root      4096 Jun 24 13:32 sbtest
drwx------ 2 root root     12288 Jun 24 11:44 sys
drwx------ 2 root root      4096 Jun 24 13:29 test
dimakuv commented 5 months ago

The manifest file in the linked repo (https://github.com/enclaive/enclaive-docker-mariadb-sgx/blob/main/mariadb.manifest.template) doesn't seem to have the /data/ path.

Could you show your manifest file that you're currently using?

harunkucuk5 commented 5 months ago

It is there, actually; I am using the same file:

https://github.com/enclaive/enclaive-docker-mariadb-sgx/blob/eb6df106232db159ed52a0bde2aaf3a574ab5540/mariadb.manifest.template#L20

harunkucuk5 commented 5 months ago

The only changes I have made to that repository are:

git diff
diff --git a/conf/my.cnf b/conf/my.cnf
index 6161d85..c77d492 100644
--- a/conf/my.cnf
+++ b/conf/my.cnf
@@ -14,6 +14,7 @@ basedir                 = /usr
 tmpdir                  = /tmp
 bind-address            = 0.0.0.0
 skip-external-locking
+log_bin                        = ON

 character-set-server    = utf8mb4
 collation-server        = utf8mb4_unicode_ci
diff --git a/entrypoint.sh b/entrypoint.sh
index 543afaa..b8f4027 100755
--- a/entrypoint.sh
+++ b/entrypoint.sh
@@ -2,5 +2,5 @@

 /aesmd.sh

-gramine-sgx-get-token --output mariadb.token --sig mariadb.sig
-gramine-sgx mariadb
+#gramine-sgx-get-token --output mariadb.token --sig mariadb.sig
+gramine-direct mariadb
dimakuv commented 5 months ago

Ah, sorry, I missed line 20 in the manifest file. Now I can see that all files under /data/ are encrypted by Gramine.

This is unfortunately not supported currently: Encrypted Files are assumed to be used by a single Gramine instance (single SGX enclave). In your case, your encrypted file is used by two Gramine instances.

Even though the encryption key is the same, the internal format of Encrypted Files is single-process as of now. For example, the file size is cached inside the Gramine instance and is not updated from the outside. This means that one Gramine instance can, for example, grow the encrypted file from 4KB to 8KB, but the other Gramine instance will still think that the encrypted file is only 4KB in size.

So I think this is what happens in your MariaDB replication case: two MariaDB replicas try to access the same encrypted file in parallel. This is currently not supported, and I'm not sure Gramine will ever support such complex cases.
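
To illustrate the unsupported sharing pattern, here is a plain-POSIX sketch (illustration only, not Gramine code; the file name and sizes are made up). Two independent processes stand in for two Gramine instances that open the same encrypted file:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical sketch of the sharing pattern described above. On plain Linux
 * the second process sees the grown 8KB file; with Gramine's Encrypted Files,
 * per the explanation above, each instance caches the file size, so the
 * second instance would still believe the file is only 4KB. */
int main(void) {
    const char *path = "shared_encrypted.dat";
    int sync_pipe[2];
    pipe(sync_pipe);

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    char page[4096];
    memset(page, 'A', sizeof(page));
    write(fd, page, sizeof(page));                /* file is now 4KB */

    if (fork() == 0) {
        int fd2 = open(path, O_RDWR);             /* "second instance" opens at 4KB */
        char go;
        read(sync_pipe[0], &go, 1);               /* wait until the file was grown */
        struct stat st;
        fstat(fd2, &st);
        printf("second instance sees %lld bytes\n", (long long)st.st_size);
        _exit(0);
    }

    sleep(1);                                     /* let the child open first */
    write(fd, page, sizeof(page));                /* grow the file to 8KB */
    write(sync_pipe[1], "x", 1);                  /* tell the child to re-check */
    wait(NULL);
    return 0;
}

On plain Linux this prints 8192; under the described Encrypted Files limitation, the second instance would keep operating on its stale 4KB view.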

harunkucuk5 commented 5 months ago

But if I do not encrypt the data volume, I still get the same error. So if the encryption were the cause, wouldn't removing it solve the issue?

I made this change to the manifest, built another image, and followed the same steps, and I still have the same issue.

git diff mariadb.manifest.template
diff --git a/mariadb.manifest.template b/mariadb.manifest.template
index 185ae17..7d50467 100644
--- a/mariadb.manifest.template
+++ b/mariadb.manifest.template
@@ -17,7 +17,7 @@ fs.mounts = [
     { path = "/lib",              uri = "file:{{ gramine.runtimedir() }}" },
     { path = "{{ arch_libdir }}", uri = "file:{{ arch_libdir }}" },
     { path = "/app",              uri = "file:/app" },
-    { path = "/var/lib/mysql",    uri = "file:/data", type = "encrypted", key_name = "default" },
+    { path = "/var/lib/mysql",    uri = "file:/data" },
 ]
 fs.insecure__keys.default = "ffeeddccbbaa99887766554433221100"

manifest file in the docker container:

grep data mariadb.manifest.template
    { path = "/var/lib/mysql",    uri = "file:/data" },

the error;

MariaDB [sbtest]> show slave status \G
*************************** 1. row ***************************
                Slave_IO_State:
                   Master_Host: 172.17.0.2
                   Master_User: root
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: ON.000001
           Read_Master_Log_Pos: 2098883
                Relay_Log_File: mysqld-relay-bin.000002
                 Relay_Log_Pos: 1050305
         Relay_Master_Log_File: ON.000001
              Slave_IO_Running: No
             Slave_SQL_Running: No
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 1594
                    Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 1050013
               Relay_Log_Space: 2099485
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: NULL
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 1595
                 Last_IO_Error: Relay log write failure: could not queue event from master
                Last_SQL_Errno: 1594
                Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-9
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State:
              Slave_DDL_Groups: 5
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 3
dimakuv commented 5 months ago

You also need to specify that whole directory as sgx.allowed_files then:

sgx.allowed_files = [
    "file:/var/lib/mysql",
]

Otherwise Gramine doesn't know what kind of files are contained in this directory (not encrypted, not trusted and not allowed -- so Gramine conservatively forbids access to these files).

Note that this is only for debugging purposes, this is insecure!

Also, did you purge all your Docker images and containers? It looks to me like you still have the old files.

harunkucuk5 commented 5 months ago

But I did not remove /var/lib/mysql from the mounts completely; I still have the following one. Should I still add this path to allowed_files? I do not have that for the other mounts, and the database is working properly: Gramine accepts database operations like new database creation, new table creation, etc.

 { path = "/var/lib/mysql",    uri = "file:/data" },

Also, did you purge all your Docker images and containers? It looks to me like you still have the old files.

Yes, I did. Since I use the same configuration, it creates new files with the same names.

dimakuv commented 5 months ago

But I did not remove /var/lib/mysql from the mounts completely; I still have the following one. Should I still add this path to allowed_files? I do not have that for the other mounts, and the database is working properly: Gramine accepts database operations like new database creation, new table creation, etc.

Yes, that's normal. It's because some FS mounts are self-sufficient (the ones with the encrypted and tmpfs types), while other FS mounts need additional hints about the kind of files contained inside them (the default FS mounts); for those you need to additionally specify either sgx.trusted_files or sgx.allowed_files.

harunkucuk5 commented 5 months ago

OK, I added it to trusted_files, purged all remnants from the older tests, and re-ran the same steps; however, I still have the same issue:

 git diff mariadb.manifest.template
diff --git a/mariadb.manifest.template b/mariadb.manifest.template
index 185ae17..16a87ce 100644
--- a/mariadb.manifest.template
+++ b/mariadb.manifest.template
@@ -17,7 +17,7 @@ fs.mounts = [
     { path = "/lib",              uri = "file:{{ gramine.runtimedir() }}" },
     { path = "{{ arch_libdir }}", uri = "file:{{ arch_libdir }}" },
     { path = "/app",              uri = "file:/app" },
-    { path = "/var/lib/mysql",    uri = "file:/data", type = "encrypted", key_name = "default" },
+    { path = "/var/lib/mysql",    uri = "file:/data" },
 ]
 fs.insecure__keys.default = "ffeeddccbbaa99887766554433221100"

@@ -38,6 +38,7 @@ sgx.trusted_files = [
     "file:/app/mariadbd",
     "file:/app/init.sql",
     "file:/app/mysql.tar.gz",
+    "file:/var/lib/mysql/",

     "file:/etc/my.cnf",
     "file:/usr/lib/mysql/plugin/",

the same error,

MariaDB [sbtest]> show slave status \G
*************************** 1. row ***************************
                Slave_IO_State:
                   Master_Host: 172.17.0.2
                   Master_User: root
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: ON.000001
           Read_Master_Log_Pos: 2098754
                Relay_Log_File: mysqld-relay-bin.000002
                 Relay_Log_Pos: 1050176
         Relay_Master_Log_File: ON.000001
              Slave_IO_Running: No
             Slave_SQL_Running: No
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 1594
                    Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
                  Skip_Counter: 0
harunkucuk5 commented 5 months ago

BTW, regarding the following statement: I have only a single replica, and it downloads the binary log from the master over TCP and replays it on the replica. So there are no multiple replicas here.

So I think this is what happens in your MariaDB replication case: two MariaDB replicas try to access the same encrypted file in parallel. This is currently not supported, and I'm not sure Gramine will ever support such complex cases.

dimakuv commented 5 months ago

OK, I added it to trusted_files, purged all remnants from the older tests, and re-ran the same steps; however, I still have the same issue

Unfortunately you can't add this path as sgx.trusted_files, because Trusted Files are enforced by Gramine to be read-only. Thus you won't be able to create these files and to write into them.

By the way, since you added that path to sgx.trusted_files and /var/lib/mysql/mysqld-relay-bin.000002 still contained some data (if I understand the output of MariaDB tools correctly), this implies that you're still using some old data/files? This file would have never been created by Gramine, because it is marked as read-only...

I have a single replica only and it downloads the binary log from the master over TCP protocol and replays the logs on the replica

Oh, so the master process writes to /var/lib/mysql/mysqld-relay-bin.000002, and then the replica connects to the master over TCP and requests this file, and master reads this file and sends to the replica?

harunkucuk5 commented 5 months ago

Unfortunately you can't add this path as sgx.trusted_files, because Trusted Files are enforced by Gramine to be read-only. Thus you won't be able to create these files and to write into them.

By the way, since you added that path to sgx.trusted_files and /var/lib/mysql/mysqld-relay-bin.000002 still contained some data (if I understand the output of MariaDB tools correctly), this implies that you're still using some old data/files? This file would have never been created by Gramine, because it is marked as read-only...

No, I added that path to trusted_files, and the database can still create new files on that path. This is probably because these are sgx options and I use gramine-direct, not gramine-sgx. Maybe gramine-direct bypasses those options? So those files are not already there when I create the container; they are created by the MariaDB database which runs in the Gramine container. Anyway, I will move the path to sgx.allowed_files and test again, but I am not sure if sgx.allowed_files will be used by gramine-direct.

Oh, so the master process writes to /var/lib/mysql/mysqld-relay-bin.000002, and then the replica connects to the master over TCP and requests this file, and master reads this file and sends to the replica?

No, the master creates its binary log, which is called ON.000001 in my setup; the replica downloads this file, creates the mysqld-relay-bin.000002 relay log file from it, and then executes the changes on the replica.

harunkucuk5 commented 5 months ago

I moved the path to sgx.allowed_files, but the result is the same.

Here are the database files on the replica before I create a replication stream; you can see the relay log file is not there yet, since there is no replication:

/data# ls -la
total 12752
drwx------ 5 root root      4096 Jun 26 10:30 .
drwxr-xr-x 1 root root      4096 Jun 26 10:30 ..
-rw-r----- 1 root root       627 Jun 26 10:30 ON.000001
-rw-r----- 1 root root        12 Jun 26 10:30 ON.index
-rw-rw---- 1 root root    417792 Jun 26 10:30 aria_log.00000001
-rw-rw---- 1 root root        52 Jun 26 10:30 aria_log_control
-rw-r----- 1 root root         9 Jun 26 10:30 ddl_recovery.log
-rw-rw---- 1 root root       906 Jun 26 10:30 ib_buffer_pool
-rw-r----- 1 root root 100663296 Jun 26 10:30 ib_logfile0
-rw-rw---- 1 root root  12582912 Jun 26 10:30 ibdata1
-rw-r----- 1 root root  12582912 Jun 26 10:30 ibtmp1
-rw-r----- 1 root root         0 Jun 26 10:30 multi-master.info
drwx------ 2 root root      4096 Jun 26 10:30 mysql
-rw-r--r-- 1 root root        15 Jun 26 10:30 mysql_upgrade_info
drwx------ 2 root root      4096 Jun 26 10:30 performance_schema
drwx------ 2 root root     12288 Jun 26 10:30 sys

And here are the database files on the replica once I create the replication stream and create a test database on the master. Now the relay log files have been created by MariaDB, since we have a replication stream:

drwx------ 6 root root      4096 Jun 26 10:37 .
drwxr-xr-x 1 root root      4096 Jun 26 10:30 ..
-rw-r----- 1 root root       627 Jun 26 10:30 ON.000001
-rw-r----- 1 root root        12 Jun 26 10:30 ON.index
-rw-rw---- 1 root root    417792 Jun 26 10:35 aria_log.00000001
-rw-rw---- 1 root root        52 Jun 26 10:30 aria_log_control
-rw-r----- 1 root root         9 Jun 26 10:30 ddl_recovery.log
-rw-rw---- 1 root root       906 Jun 26 10:30 ib_buffer_pool
-rw-r----- 1 root root 100663296 Jun 26 10:37 ib_logfile0
-rw-rw---- 1 root root  12582912 Jun 26 10:30 ibdata1
-rw-r----- 1 root root  12582912 Jun 26 10:30 ibtmp1
-rw-r----- 1 root root       134 Jun 26 10:35 master.info
-rw-r----- 1 root root         0 Jun 26 10:30 multi-master.info
drwx------ 2 root root      4096 Jun 26 10:30 mysql
-rw-r--r-- 1 root root        15 Jun 26 10:30 mysql_upgrade_info
-rw-r----- 1 root root       310 Jun 26 10:35 mysqld-relay-bin.000001
-rw-r----- 1 root root      1052 Jun 26 10:37 mysqld-relay-bin.000002
-rw-r----- 1 root root        52 Jun 26 10:35 mysqld-relay-bin.index
drwx------ 2 root root      4096 Jun 26 10:30 performance_schema
-rw-r----- 1 root root        46 Jun 26 10:35 relay-log.info
drwx------ 2 root root      4096 Jun 26 10:37 sbtest
drwx------ 2 root root     12288 Jun 26 10:30 sys

manifest file;

grep -A 3 allowed mariadb.manifest.template
sgx.allowed_files = [
    "file:/var/lib/mysql/",
]

and the same replication issue,

MariaDB [sbtest]> show slave status \G
*************************** 1. row ***************************
                Slave_IO_State:
                   Master_Host: 172.17.0.2
                   Master_User: root
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: ON.000001
           Read_Master_Log_Pos: 2098754
                Relay_Log_File: mysqld-relay-bin.000002
                 Relay_Log_Pos: 1050176
         Relay_Master_Log_File: ON.000001
              Slave_IO_Running: No
             Slave_SQL_Running: No
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 1594
                    Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 1049884
               Relay_Log_Space: 2099356
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: NULL
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 1595
                 Last_IO_Error: Relay log write failure: could not queue event from master
                Last_SQL_Errno: 1594
                Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-8
       Replicate_Do_Domain_Ids:
dimakuv commented 5 months ago

No, I added that path to trusted_files, and the database can still create new files on that path. This is probably because these are sgx options and I use gramine-direct, not gramine-sgx. Maybe gramine-direct bypasses those options?

Sorry, yes, my stupid mistake. I forgot that you run with gramine-direct, where indeed Trusted/Allowed Files make no difference.

No, the master creates its binary log, which is called ON.000001 in my setup; the replica downloads this file, creates the mysqld-relay-bin.000002 relay log file from it, and then executes the changes on the replica.

Can you check whether the file ON.000001 on the master is corrupted? We need to understand whether the corruption happens (1) on the master when it writes the file, (2) during the TCP download, or (3) on the replica when it reads the file.

harunkucuk5 commented 5 months ago

Yes, ON.000001 is not corrupted. I verified this because I can read all entries from that file, and also because a slave that does not run in Gramine replicates from the master in Gramine without any issues. However, a replica in Gramine fails with the error.

dimakuv commented 5 months ago

However, a replica in Gramine fails with the error.

Could you do some more investigation into whether the problem is in (2) or in (3)? See my previous comment.

harunkucuk5 commented 5 months ago

On the same network, when I create another replica without Gramine, it downloads the binary log from the Gramine master without any issues.

Another test to eliminate (2): if I use the mariadb-binlog utility to download the master's binary log file to the replica that runs in the Gramine container, it downloads without any issues with the following command.

I think these two tests show that the issue is not related to (2).

./mariadb-binlog --read-from-remote-server --host=172.17.0.2 --user=<sanitized> --password=<sanitized> ON.000001 > binlog
root@sek-idp-gramine-test:/tmp# echo $?
0

root@sek-idp-gramine-test:/tmp# head binlog
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#240626 10:29:55 server id 1  end_log_pos 256 CRC32 0xe65ed405  Start: binlog v 4, server v 10.6.19-MariaDB-log created 240626 10:29:55 at startup
ROLLBACK/*!*/;
BINLOG '
o+17Zg8BAAAA/AAAAAABAAAAAAQAMTAuNi4xOS1NYXJpYURCLWxvZwAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAACj7XtmEzgNAAgAEgAEBAQEEgAA5AAEGggAAAAICAgCAAAACgoKAAAAAAAA
dimakuv commented 5 months ago

@harunkucuk5 Ok, the problem seems quite complicated...

Do you think you can take a look into the MariaDB code and see how exactly the download and the reading from the downloaded file happen? There is something very low-level going on.

Also, can you show the "trace" logs of Gramine on the MariaDB replica?

harunkucuk5 commented 4 months ago

@dimakuv, MariaDB replication has two threads. The first one (IO_THREAD) is responsible for downloading the binary log files from the master into the relay log, and the other one (SQL_THREAD) is responsible for reading the events from this relay log file and applying them to the replica database.

If I set up replication and stop the SQL_THREAD, the replica downloads the binary logs without issues. If I then stop the IO_THREAD and start the SQL_THREAD, it applies the downloaded binary logs without issues. But if I let them run at the same time, it ends up with the replication error that I shared before.

I compiled MariaDB with debugging enabled and reproduced the issue. This time it crashed with the following trace:

mariadbd: /server/mysys/mf_iocache.c:1723: my_b_flush_io_cache: Assertion `info->end_of_file == inline_mysql_file_tell("/server/mysys/mf_iocache.c", 1723, info->file, (myf) (0))' failed.
240703 11:51:27 [ERROR] mysqld got signal 6 ;
Sorry, we probably made a mistake, and this is a bug.

Your assistance in bug reporting will enable us to fix this for the next release.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.6.19-MariaDB-debug-log source revision: f6fcfc1a6a058fd7cac6bf53216ea73f3a04b22d
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=3
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 468166 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x24582abdb098
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x24582abd9b48 thread_stack 0x49000
/app/mariadbd(my_print_stacktrace+0x44)[0x2458481fa5e9]
/app/mariadbd(handle_fatal_signal+0x421)[0x2458477d7b76]
libc_sigaction.c:0(__restore_rt)[0x2458461b8ed0]
pthread_kill.c:0(__pthread_kill_implementation)[0x245846204766]
/lib/libc.so.6(raise+0x16)[0x2458461b8e26]
/lib/libc.so.6(abort+0xc5)[0x2458461a37e5]
loadmsgcat.c:0(_nl_load_domain.cold)[0x2458461a371b]
/lib/libc.so.6(+0x3acc6)[0x2458461b1cc6]
/app/mariadbd(my_b_flush_io_cache+0x2a4)[0x2458481ce3ec]
/app/mariadbd(_ZN13MYSQL_BIN_LOG14flush_and_syncEPb+0x110)[0x245847937cc2]
/app/mariadbd(_ZN13MYSQL_BIN_LOG18write_event_bufferEPhj+0x509)[0x245847937ad5]
/app/mariadbd(+0x8f4353)[0x2458472f1353]
/app/mariadbd(handle_slave_io+0x13c2)[0x2458472e9f0f]
/app/mariadbd(+0x1198ccb)[0x245847b95ccb]
pthread_create.c:0(start_thread)[0x245846202a17]
/lib/libc.so.6(clone+0x51)[0x245846284d61]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x0): (null)
Connection ID (thread ID): 6
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=off,cset_narrowing=off

The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/ contains
information that should help you find out what is causing the crash.

So it points to this part of the MariaDB server code:

https://github.com/MariaDB/server/blob/7c5fdc9b6a45f4e8bad1ae5c7c93b457be9891fa/mysys/mf_iocache.c#L1804-L1808

        info->end_of_file= eof;
        info->append_read_pos= info->write_buffer;
        DBUG_ASSERT(info->end_of_file == mysql_file_tell(info->file, MYF(0)));
      }
      else

It is in mf_iocache.c, in the my_b_flush_io_cache function.

This file has the following comment section:

/* Cashing of files with only does (sequential) read or writes of fixed- length records. A read isn't allowed to go over file-length. A read is ok if it ends at file-length and next read can try to read after file-length (and get a EOF-error). Possibly use of asyncronic io. macros for read and writes for faster io. Used instead of FILE when reading or writing whole files. One can change info->pos_in_file to a higher value to skip bytes in file if also info->read_pos is set to info->read_end. If called through open_cached_file(), then the temporary file will only be created if a write exeeds the file buffer or if one calls my_b_flush_io_cache().

If one uses SEQ_READ_APPEND, then two buffers are allocated, one for reading and another for writing. Reads are first done from disk and then done from the write buffer. This is an efficient way to read from a log file when one is writing to it at the same time. For this to work, the file has to be opened in append mode! Note that when one uses SEQ_READ_APPEND, one MUST write using my_b_append ! This is needed because we need to lock the mutex every time we access the write buffer.

TODO: When one SEQ_READ_APPEND and we are reading and writing at the same time, each time the write buffer gets full and it's written to disk, we will always do a disk read to read a part of the buffer from disk to the read buffer. This should be fixed so that when we do a my_b_flush_io_cache() and we have been reading the write buffer, we should transfer the rest of the write buffer to the read buffer before we start to reuse it. */

Basically, there is a writer thread and a reader thread, and they both use the same file. There is also a buffer: the reader thread first reads from disk and then reads from the write buffer. Maybe reading the buffer with Gramine causes some conflicts.
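
To make that access pattern concrete, here is a minimal, hypothetical C sketch (not MariaDB's actual IO_CACHE code): a writer thread appends through a descriptor opened with O_APPEND while a reader thread seeks and reads on the same descriptor, similar to the relay-log IO and SQL threads.

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical sketch (compile with -pthread): one thread keeps appending
 * records through an O_APPEND descriptor while another thread lseek()s and
 * read()s on the same descriptor. With correct O_APPEND handling, every
 * write lands at end-of-file and the final size is exactly 1000 * 13 bytes,
 * no matter where the reader has moved the shared file offset. */
static int fd;

static void *writer(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000; i++)
        write(fd, "event-record\n", 13);   /* must append at EOF every time */
    return NULL;
}

static void *reader(void *arg) {
    (void)arg;
    char buf[4096];
    for (int i = 0; i < 100; i++) {
        lseek(fd, 0, SEEK_SET);            /* moves the shared file offset */
        while (read(fd, buf, sizeof(buf)) > 0)
            ;                              /* consume whatever is there */
    }
    return NULL;
}

int main(void) {
    fd = open("relay_demo.log", O_CREAT | O_RDWR | O_APPEND | O_TRUNC, 0644);
    pthread_t w, r;
    pthread_create(&w, NULL, writer, NULL);
    pthread_create(&r, NULL, reader, NULL);
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    printf("final size: %lld (expected 13000)\n",
           (long long)lseek(fd, 0, SEEK_END));
    close(fd);
    return 0;
}

If a runtime emulated O_APPEND only by tracking a per-descriptor position, the reader's lseek() could redirect a later write into the middle of the file, which would corrupt a relay-log-like file exactly as seen here.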

I will keep reading the code, but please note that I am not a C developer or an expert on this topic. I will try to find some more details though.

I also set the Gramine log level to trace and reproduced the issue again. The logs are in the attachment: replica_logs.tar.gz

dimakuv commented 4 months ago

Thanks @harunkucuk5, this was a great exploration on your part! Without your thorough experiments and debugging, we wouldn't have been able to root-cause this bug.

So I got it. Gramine currently does not support O_APPEND explicitly, under the assumption that its current update-pos-on-write behavior imitates O_APPEND semantics. This worked in most cases we observed in the wild, but it breaks in this particular case with MariaDB/MySQL. Here is the relevant part of the log you shared, with my comments on what goes wrong:

# previous (successful) write to the file fd=47, appends to the end of file, pos is correct
[P1:T14:mariadbd] trace: ---- lseek(47, 0xac1a9f, SEEK_SET) = 0xac1a9f
[P1:T14:mariadbd] trace: ---- write(47, 0x23335ee25a88, 0x1f) ...
[P1:T14:mariadbd] trace: ---- return from write(...) = 0x1f
[P1:T14:mariadbd] trace: ---- lseek(47, 0x0, SEEK_CUR) = 0xac1abe
[P1:T14:mariadbd] trace: ---- lseek(47, 0x0, SEEK_CUR) = 0xac1abe
[P1:T14:mariadbd] trace: ---- lseek(47, 0x0, SEEK_END) = 0xac1abe
[P1:T14:mariadbd] trace: ---- lseek(47, 0x0, SEEK_CUR) = 0xac1abe
[P1:T14:mariadbd] trace: ---- lseek(47, 0xac1abe, SEEK_SET) = 0xac1abe

# read in-between writes from the file fd=47, reads from a specific (previous) pos, so pos is updated inside Gramine
[P1:T15:mariadbd] trace: ---- lseek(47, 0x943000, SEEK_SET) = 0x943000
[P1:T15:mariadbd] trace: ---- read(47, 0x23335eb336d9, 0x7e000) ...
[P1:T15:mariadbd] trace: ---- return from read(...) = 0x7e000
[P1:T15:mariadbd] trace: ---- read(47, 0x23335ee23a88, 0x2000) ...
[P1:T15:mariadbd] trace: ---- return from read(...) = 0x2000

# last (failing) write to the file fd=47, DOES NOT APPEND but writes to an updated-by-reader pos (somewhere in the middle of the file)
[P1:T14:mariadbd] trace: ---- write(47, 0x23335ee25a88, 0x2a) ...
[P1:T14:mariadbd] trace: ---- return from write(...) = 0x2a
[P1:T14:mariadbd] trace: ---- lseek(47, 0x0, SEEK_CUR) = 0x9c302a
[P1:T14:mariadbd] trace: ---- lseek(47, 0x0, SEEK_CUR) = 0x9c302a
[P1:T14:mariadbd] trace: ---- write(2, 0x23335eed1ba0, 0xbb) ...
mariadbd: /server/mysys/mf_iocache.c:1723: my_b_flush_io_cache: Assertion `info->end_of_file == inline_mysql_file_tell("/server/mysys/mf_iocache.c", 1723, info->file, (myf) (0))' failed.

Basically, there is a reader thread that messes with the current position (pos) in the file, and the writer doesn't update this position on write() (which it technically should -- updating to the total file size -- if opened with O_APPEND).
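
For reference, a minimal standalone sketch of the POSIX guarantee the writer relies on (illustration only, not Gramine or MariaDB code): with O_APPEND, every write() must land at the current end of file, even if the offset was just moved.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd = open("append_demo.log", O_CREAT | O_RDWR | O_APPEND | O_TRUNC, 0644);

    write(fd, "first\n", 6);          /* file is now 6 bytes */

    /* Simulate the reader: move the shared offset away from EOF. */
    char buf[6];
    lseek(fd, 0, SEEK_SET);
    read(fd, buf, sizeof(buf));
    lseek(fd, 0, SEEK_SET);

    /* O_APPEND semantics: this write must still go to offset 6 (EOF),
     * not to offset 0 where the offset currently points. */
    write(fd, "second\n", 7);

    printf("size: %lld (expected 13)\n", (long long)lseek(fd, 0, SEEK_END));
    close(fd);
    return 0;
}

Emulating O_APPEND only by advancing the offset on each write breaks as soon as another thread moves that shared offset in between, which matches the failing trace above.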

I'll create a draft PR with a quick fix now, just to check that your problem will be solved.

dimakuv commented 4 months ago

@harunkucuk5 Can you try PR https://github.com/gramineproject/gramine/pull/1935 ?

harunkucuk5 commented 4 months ago

@dimakuv, thanks for the quick PR. I tried to test it; however, I am probably missing something, as I get the following error when I try to create a container from the image:

docker run --name replica --security-opt seccomp=/root/docker_seccomp_aug_2022.json gsc-gramine-mariadb:v17-unsigned bash
Invalid application path specified (/gramine/app_files/entrypoint.manifest.sgx does not exist).
The path should point to application configuration files, so that they can be
found after appending corresponding extensions.

Here is what I did to use your branch.

I modified the Dockerfile of the enclaive-docker-mariadb-sgx project, changing the enclaive/gramine-os base image to an Ubuntu image, since I will use GSC to graminize the image with your branch:

~/enclaive-docker-mariadb-sgx# git diff Dockerfile
diff --git a/Dockerfile b/Dockerfile
index d013e6f..5acd5d0 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -66,7 +66,7 @@ RUN gcc main.c -Wall -Wextra -Werror -Wpedantic -Wno-unused-parameter -O3 \

 # final stage

-FROM enclaive/gramine-os:jammy-7e9d6925
+FROM ubuntu:jammy

 RUN apt-get update \
     && apt-get install -y --no-install-recommends liblzo2-2 \
@@ -86,10 +86,6 @@ COPY --from=middlemain /build/launcher /app/
 COPY ./mariadb.manifest.template ./entrypoint.sh ./conf/init.sql /app/
 COPY ./conf/my.cnf /etc/

-RUN gramine-argv-serializer "/app/mariadbd" "--init-file=/app/init.sql" > ./argv \
-    && gramine-manifest -Darch_libdir=/lib/x86_64-linux-gnu mariadb.manifest.template mariadb.manifest \
-    && gramine-sgx-sign --key "$SGX_SIGNER_KEY" --manifest mariadb.manifest --output mariadb.manifest.sgx
-
 VOLUME /data/

 EXPOSE 3306/tcp

I also modified the manifest file, since otherwise I got error messages like "gramine is not a known variable":

~/enclaive-docker-mariadb-sgx# git diff mariadb.manifest.template
diff --git a/mariadb.manifest.template b/mariadb.manifest.template
index 185ae17..8a922fe 100644
--- a/mariadb.manifest.template
+++ b/mariadb.manifest.template
@@ -1,9 +1,7 @@
-loader.entrypoint = "file:{{ gramine.libos }}"
-
 libos.entrypoint = "/app/launcher"
 loader.argv_src_file = "file:/app/argv"

-loader.log_level = "error"
+loader.log_level = "trace"
 #loader.log_file = "log.txt"

 loader.env.LD_LIBRARY_PATH = "/lib:{{ arch_libdir }}"
@@ -14,10 +12,10 @@ fs.mounts = [
     { path = "/run/mysqld",       uri = "file:/run/mysqld", type = "tmpfs" },
     { path = "/etc",              uri = "file:/etc" },
     { path = "/usr",              uri = "file:/usr" },
-    { path = "/lib",              uri = "file:{{ gramine.runtimedir() }}" },
+    { path = "/lib",              uri = "file:/usr/local/lib/x86_64-linux-gnu/gramine/runtime" },
     { path = "{{ arch_libdir }}", uri = "file:{{ arch_libdir }}" },
     { path = "/app",              uri = "file:/app" },
-    { path = "/var/lib/mysql",    uri = "file:/data", type = "encrypted", key_name = "default" },
+    { path = "/var/lib/mysql",    uri = "file:/data" },
 ]
 fs.insecure__keys.default = "ffeeddccbbaa99887766554433221100"

@@ -29,8 +27,8 @@ sgx.isvprodid = 1
 sgx.isvsvn    = 1

 sgx.trusted_files = [
-    "file:{{ gramine.libos }}",
-    "file:{{ gramine.runtimedir() }}/",
+    "file:/usr/local/lib/x86_64-linux-gnu/gramine/libsysdb.so",
+    "file:/usr/local/lib/x86_64-linux-gnu/gramine/runtime",
     "file:{{ arch_libdir }}/",

     "file:/app/argv",

Then I cloned the GSC project and updated the config template to use your branch:


~/gsc# git diff
diff --git a/config.yaml.template b/config.yaml.template
index ef64080..a74f374 100644
--- a/config.yaml.template
+++ b/config.yaml.template
@@ -29,7 +29,7 @@ Registry: ""
 # branch is guaranteed to work with current Gramine `master` branch).
 Gramine:
     Repository: "https://github.com/gramineproject/gramine.git"
-    Branch:     "master"
+    Branch:     "dimakuv/add-oappend-only-singleprocess"

 # Specify the Intel SGX driver installed on your machine (more specifically, on the machine where
 # the graminized Docker container will run); there are several variants of the SGX driver:

Then I built the Docker image with the following command:

~/enclaive-docker-mariadb-sgx# docker build -t gramine-mariadb:v17 .

And then I graminized that image with the following command:

./gsc build -c config.yaml.template gramine-mariadb:v17 /root/enclaive-docker-mariadb-sgx/mariadb.manifest.template

and got the following output,

Successfully tagged gsc-gramine-mariadb:v17-unsigned
Successfully built an unsigned graminized Docker image gsc-gramine-mariadb:v17-unsigned from original application image gramine-mariadb:v17.

I have the following images.

 docker images|grep v17
gsc-gramine-mariadb      v17-unsigned   9accc8c3d2b0   38 minutes ago   222MB
gramine-mariadb          v17            889a841b215f   52 minutes ago   112MB

However, when I try to create a container, I get the following error:

docker run --name replica --security-opt seccomp=/root/docker_seccomp_aug_2022.json gsc-gramine-mariadb:v17-unsigned bash
Invalid application path specified (/gramine/app_files/entrypoint.manifest.sgx does not exist).
The path should point to application configuration files, so that they can be
found after appending corresponding extensions.
dimakuv commented 4 months ago

You seem to have forgotten to do the gsc sign-image step. Please check the docs and the example here: https://gramine.readthedocs.io/projects/gsc/en/latest/#example

You need to sign the image (with whatever signing key, you can create a dummy one, see the example in the docs), otherwise GSC won't finalize the resulting Docker image.

harunkucuk5 commented 4 months ago

Thanks @dimakuv, I signed the image and made it work. Unfortunately, the same issue still exists. I will enable trace logging and send you the Gramine trace again.

2024-07-05 14:11:27 8 [ERROR] Error in Log_event::read_log_event(): 'Replication event checksum verification failed while reading from a log file', data_len: 524293, event_type: 2
2024-07-05 14:11:27 8 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
2024-07-05 14:11:27 8 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Internal MariaDB error code: 1594
2024-07-05 14:11:27 8 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'ON.000001' position 26227194; GTID position '0-1-59'
2024-07-05 14:11:27 8 [Note] Slave SQL thread exiting, replication stopped in log 'ON.000001' at position 26227194; GTID position '0-1-59', master: 172.30.43.5:3306
2024-07-05 14:11:27 7 [ERROR] Slave I/O: Relay log write failure: could not queue event from master, Internal MariaDB error code: 1595
2024-07-05 14:11:27 7 [Note] Slave I/O thread exiting, read up to log 'ON.000001', position 26751666; GTID position 0-1-60, master 172.30.43.5:3306
*************************** 1. row ***************************
                Slave_IO_State:
                   Master_Host: 172.30.43.5
                   Master_User: root
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: ON.000001
           Read_Master_Log_Pos: 26751666
                Relay_Log_File: mysqld-relay-bin.000002
                 Relay_Log_Pos: 26227486
         Relay_Master_Log_File: ON.000001
              Slave_IO_Running: No
             Slave_SQL_Running: No
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 1594
                    Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 26227194
               Relay_Log_Space: 26752268
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: NULL
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 1595
                 Last_IO_Error: Relay log write failure: could not queue event from master
                Last_SQL_Errno: 1594
                Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MariaDB code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-1-60
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State:
              Slave_DDL_Groups: 8
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 52
harunkucuk5 commented 4 months ago

@dimakuv, the Gramine trace logs are in the attached file.

replica_logs.gz

dimakuv commented 4 months ago

@harunkucuk5 Unfortunately I don't find anything interesting in the logs.

Could you build MariaDB in debug mode, like you did previously? Maybe we'll hit some assert, which will give us more information on what's going on.

dimakuv commented 4 months ago

Actually, @harunkucuk5, can you check https://github.com/gramineproject/gramine/pull/1935 again? I modified it today.

harunkucuk5 commented 4 months ago

@dimakuv, thanks for the PR. I could not reproduce the issue with this PR :rocket: I will run broader tests to ensure it is fixed and get back to you. I am adding Gramine trace logs and the MariaDB debug logs in case you need to check something.

replica_logs.gz

harunkucuk5 commented 4 months ago

@dimakuv, it seems to be working without issues. Thanks for the fix!

harunkucuk5 commented 4 months ago

Hi @dimakuv, we are currently using Gramine v1.6. Would it be possible to receive this fix in our current version (1.6.x), or would it be advisable to upgrade Gramine to the latest version once the fix becomes available?

dimakuv commented 4 months ago

Would it be possible to receive this fix in our current version (1.6.x), or would it be advisable to upgrade Gramine to the latest version once the fix becomes available?

The latter (upgrade to latest version). We don't do backports, i.e., we will not release a 1.6.x version.

oliver-steinbrecher commented 4 months ago

@dimakuv can you provide transparency on when the fix will be officially released?

dimakuv commented 4 months ago

@oliver-steinbrecher If things go well, the next release of Gramine (v1.8) will happen around September. Unless some critical issues are detected in the code review process for the fix, I expect it to be incorporated in that release.

oliver-steinbrecher commented 1 month ago

@dimakuv can you confirm that this fix is included in the 1.8 tag?

molk-ibm commented 1 month ago

@dimakuv: if PR #1935 is required for the fix, v1.8 doesn't include it, as the PR is not merged yet, right?

dimakuv commented 1 month ago

@oliver-steinbrecher @molk-ibm Correct, the fix is not yet merged. Sorry, this had lower priority in our task queue, and it wasn't yet reviewed by other maintainers.

molk-ibm commented 1 month ago

Hey @dimakuv, thanks for the reply! Do you have any idea when this might be released?

dimakuv commented 1 month ago

@molk-ibm This depends on reviewers: @kailun-qin and @mkow mainly

mkow commented 3 weeks ago

Probably in the next two/three months, but we're low on resources and high on high-priority tasks, unfortunately...

oliver-steinbrecher commented 3 weeks ago

Can you help us understand what is left or needs to be done in order to integrate it? I'm asking because we need this fix for a production solution. We're considering building a fork version now.

woju commented 3 weeks ago

@mkow already explained: it needs time for review. Yes, review takes time and we have limited time allocated. It's as simple as that.