greenlion / warp

WarpSQL Server, an open source OLAP focused distribution of the world's most popular open source database bundled with OLAP performance related plugins such as the WARP storage engine..
http://warpsql.blog
Other
41 stars 2 forks source link

Rows lost when converting InnoDB -> WARP and when using INSERT .. SELECT from an non-WARP table #61

Closed federico-razzoli closed 3 years ago

federico-razzoli commented 3 years ago
mysql> CREATE TABLE t (a INT) ENGINE InnoDB;
Query OK, 0 rows affected (0.03 sec)

mysql> INSERT INTO t VALUES (1);
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO t VALUES (2);
Query OK, 1 row affected (0.01 sec)

mysql> SELECT * FROM t;
+------+
| a    |
+------+
|    1 |
|    2 |
+------+
2 rows in set (0.00 sec)

mysql> ALTER TABLE t ENGINE WARP;
Query OK, 2 rows affected (0.06 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t;
+------+
| a    |
+------+
|    1 |
+------+
1 row in set (0.01 sec)
federico-razzoli commented 3 years ago

Not just InnoDB. The same happens with MyISAM.

s4sukan commented 3 years ago

Same problem but another use case for me,

When try to insert data from InnoDB table, some rows are apparently missing.

mysql> select count(*) from analytics.employees_2020;
+----------+
| count(*) |
+----------+
|  3791851 |
+----------+
1 row in set (0.15 sec)

mysql> insert into warp.employees_2020 select * from analytics.employees_2020;
Query OK, 3791851 rows affected (22.84 sec)
Records: 3791851  Duplicates: 0  Warnings: 0

mysql> select count(*) from warp.employees_2020;
+----------+
| count(*) |
+----------+
|   291844 |
+----------+
1 row in set (0.53 sec)
federico-razzoli commented 3 years ago

@s4sukan It could be a different bug. In your case, the number of affected rows reported by the client is correct. It's possible that the COUNT() is wrong. I suggest you run a normal SELECT and check the number of returned rows.

greenlion commented 3 years ago

@federico-razzoli interesting. I never encountered this problem. I do not have any code for ALTER table. Will investigate. Thank you for the bug report.

greenlion commented 3 years ago

@s4sukan please do me a favor, and in your MySQL data directory: cd analytics/employees_2020.data find . -type d

please attach the output

greenlion commented 3 years ago

related: MySQL localhost ssb_sf10_warp SQL > create table t1(c1 int) engine=innodb; Query OK, 0 rows affected (0.1485 sec) MySQL localhost ssb_sf10_warp SQL > insert into t1 values (1); Query OK, 1 row affected (0.0194 sec) MySQL localhost ssb_sf10_warp SQL > insert into t1 values (2); Query OK, 1 row affected (0.0088 sec) MySQL localhost ssb_sf10_warp SQL > create table t2 engine=warp as select * from t1; Query OK, 2 rows affected (0.0687 sec)

Records: 2 Duplicates: 0 Warnings: 0 MySQL localhost ssb_sf10_warp SQL > select * from t2; Empty set (0.0629 sec)

s4sukan commented 3 years ago

find . -type d

[root@mylab employees_2020.data]#  find . -type d
.
./p0
./p2
./p3
./p4
./p5
./p6
./p7
./p8
./p9
./p10
./p11
./p12
./p13
./p14
./p15
./p16
greenlion commented 3 years ago

Fix for ALTER TABLE - https://github.com/greenlion/warp/commit/a7eb5ae5585cc84d9cacfe5c180a792169ce1c0b

mysql> create table t1(c1 int) engine=innodb;
Query OK, 0 rows affected (0.02 sec)

mysql> insert into t1 values (1);
Query OK, 1 row affected (0.01 sec)

mysql> insert into t1 values (1);
Query OK, 1 row affected (0.00 sec)

mysql> select * from t1;
+------+
| c1   |
+------+
|    1 |
|    1 |
+------+
2 rows in set (0.01 sec)

mysql> alter table t1 engine=warp;
Query OK, 2 rows affected (0.07 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from t1;
+------+
| c1   |
+------+
|    1 |
|    1 |
+------+
2 rows in set (0.00 sec)
greenlion commented 3 years ago

Similar fix for CREATE TABLE .. AS SELECT https://github.com/greenlion/warp/commit/98bcd2d9b9a908c788b4e9d76ee87d0fce7bde9d

greenlion commented 3 years ago

@s4sukan

I can not duplicate the INSERT .. SELECT issue in my environment. I suspect there is some data in your table, perhaps string data, that is preventing some rows from making it into the WARP table.

It would be helpful if you can identify the missing rows, and try to insert them manually and see if that works.

mysql> select count(*) From lineorder;
+----------+
| count(*) |
+----------+
|  6001215 |
+----------+
1 row in set (3.17 sec)
mysql> insert into lineorder_w select * from lineorder;
Query OK, 6001215 rows affected (1 min 38.48 sec)
Records: 6001215  Duplicates: 0  Warnings: 0
mysql> select count(*) From lineorder_w;
+----------+
| count(*) |
+----------+
|  6001215 |
+----------+
1 row in set (1.75 sec)
s4sukan commented 3 years ago

Sure, uploaded the employees table which is in InnoDB.

  1. Create warp engine table
    CREATE TABLE `employees` (
    `id` int NOT NULL,
    `fname` varchar(30) DEFAULT NULL,
    `lname` varchar(30) DEFAULT NULL,
    `hired` date NOT NULL DEFAULT '1970-01-01',
    `separated` date DEFAULT NULL,
    `job_code` char(3) NOT NULL,
    `store_id` int NOT NULL,
    PRIMARY KEY (`id`)
    ) ENGINE=WARP;
  2. Insert data into warp table
    mysql> insert into warp.employees select * from analytics.employees;
    Query OK, 899515 rows affected (5.35 sec)
    Records: 899515  Duplicates: 0  Warnings: 0
  3. Select the data
    mysql> select count(*) from warp.employees;
    +----------+
    | count(*) |
    +----------+
    |   398820 |
    +----------+
    1 row in set (0.35 sec)
    mysql> select id from warp.employees into outfile '/tmp/qid';
    Query OK, 398820 rows affected (0.27 sec)

employees.sql.gz

greenlion commented 3 years ago

Hmm. It worked for me.

mysql> create database warp;
Query OK, 1 row affected (0.01 sec)

mysql> use warp;
Database changed
mysql> CREATE TABLE `employees` (   `id` int NOT NULL,   `fname` varchar(30) DEFAULT NULL,   `lname` varchar(30) DEFAULT NULL,   `hired` date NOT NULL DEFAULT '1970-01-01',   `separated` date DEFAULT NULL,   `job_code` char(3) NOT NULL,   `store_id` int NOT NULL,   PRIMARY KEY (`id`) ) ENGINE=WARP;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into employees select * from test.employees;
Query OK, 899515 rows affected (9.17 sec)
Records: 899515  Duplicates: 0  Warnings: 0

mysql> select count(*) from employees;
+----------+
| count(*) |
+----------+
|   899515 |
+----------+
1 row in set (0.28 sec)
greenlion commented 3 years ago

@s4sukan What directives do you have in your my.cnf?

s4sukan commented 3 years ago
[root@mylab sukan]# less /etc/my.cnf
[root@mylab sukan]# cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
innodb_log_file_size=256M
innodb_buffer_pool_size=4G
socket=/tmp/mysql.sock
port=3306
secure_file_priv=""
local_infile=1
user=mysql

[client]
socket=/tmp/mysql.sock
[root@mylab sukan]# mysql -u root -e"select @@version;"
+-----------+
| @@version |
+-----------+
| 8.0.21    |
+-----------+
[root@mylab sukan]# cat /etc/redhat-release
CentOS Linux release 8.2.2004 (Core)
greenlion commented 3 years ago

I investigated this on the problem machine. The delete bitmap was corrupted in some way, resulting in some rows being reported as being deleted when they were not. Waiting for some feedback in slack from @s4sukan regarding if the database crashed, or what possible commands (update/delete/etc) were done before the problem started happening. Will do further testing to try to reproduce locally.

greenlion commented 3 years ago

Was not delete bitmap corruption. Duplicate rowids were created for tables due to a bug in ::write_row. Added a variable to WARP global state to track generation count and regenerate when WARP_ROWID_BATCH_SIZE(hard coded for now) rows have been inserted.