stoneatom / stonedb

StoneDB is an Open-Source MySQL HTAP and MySQL-Native DataBase for OLTP, Real-Time Analytics, a counterpart of MySQLHeatWave. (https://stonedb.io)
https://stonedb.io/
GNU General Public License v2.0
862 stars 139 forks source link

bug: create table.....ignore select...。Query result error #232

Closed shangyanwen closed 2 years ago

shangyanwen commented 2 years ago

Describe the problem

create table if not exists t1 (a int unique, b int) ignore select 1 as a, 1 as b union select 1 as a, 2 as b;
select * from t1;
+------+------+
| a    | b    |
+------+------+
|    1 |    1 |
|    1 |    2 |
+------+------+
2 rows in set (0.00 sec)

Expected behavior

select * from t1;
+------+------+
| a    | b    |
+------+------+
|    1 |    1 |
+------+------+
1 row in set (0.00 sec)

note:The above is the result of innoDB validation test

How To Reproduce

create table if not exists t1 (a int unique, b int) ignore select 1 as a, 1 as b union select 1 as a, 2 as b;
select * from t1;

Environment

  1. StoneDB for mysql5.7 (release)
  2. Ubuntu 20.04.4

Are you interested in submitting a PR to solve the problem?

hustjieke commented 2 years ago

In stonedb:

mysql> use test ;
Database changed
mysql> create table if not exists t2 (a int unique, b int) engine=stonedb ignore select 1 as a, 1 as b union select 1 as a, 2 as b;
Query OK, 2 rows affected, 1 warning (0.04 sec)
Records: 2  Duplicates: 0  Warnings: 1
mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1592
Message: Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. CREATE... IGNORE SELECT is unsafe because the order in which rows are retrieved by the SELECT determines which (if any) rows are ignored. This order cannot be predicted and may differ on master and the slave.
1 row in set (0.00 sec)

In innodb:

*************************** 1. row ***************************
  Level: Warning
   Code: 1062
Message: Duplicate entry '1' for key 'a'
*************************** 2. row ***************************
  Level: Note
   Code: 1592
Message: Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. CREATE... IGNORE SELECT is unsafe because the order in which rows are retrieved by the SELECT determines which (if any) rows are ignored. This order cannot be predicted and may differ on master and the slave.
2 rows in set (0.00 sec)
hustjieke commented 2 years ago

Test in stonedb with stonedb_insert_delayed = 0/1;

hustjieke commented 2 years ago
│   705           if (tab) {                                                                                                                                                                                       │
│   706             std::vector<std::string_view> fields;                                                                                                                                                          │
│   707             std::vector<uint> cols = tab->KeyCols();                                                                                                                                                       │
│   708             for (auto &col : cols) {                                                                                                                                                                       │
│   709               fields.emplace_back(vcs[col].GetDataBytesPointer(0), vcs[col].Size(0));                                                                                                                      │
│   710             }                                                                                                                                                                                              │
│   711                                                                                                                                                                                                            │
│   712             if (tab->InsertIndex(current_tx, fields, NoObj()) == common::ErrorCode::DUPP_KEY) {                                                                                                            │
│   713               STONEDB_LOG(LogCtl_Level::INFO, "Insert duplicate key on row %d", NoObj() - 1);                                                                                                              │
│   714               return HA_ERR_FOUND_DUPP_KEY;                                                                                                                                                                │
│   715             }                                                                                                                                                                                              │
│   716           }                                                                                                                                                                                                │
│  >717           for (uint i = 0; i < NoAttrs(); i++) {                                                                                                                                                           │
│   718             m_attrs[i]->LoadData(&vcs[i]);                                                                                                                                                                 │
│   719           }   

When do create table t (a int primary key ...., tabl is not nil. When do create table (a int unique ..., tab is nil. I think sth wrong happend here.

hustjieke commented 2 years ago

After gdb, locate code in rc_table_index:

│   130         common::ErrorCode RCTableIndex::CreateIndexTable(const std::string &name, TABLE *table) {                                                                                                          │
│   131           std::string str;                                                                                                                                                                                 │
│   132           if (!NormalizeName(name, str)) {                                                                                                                                                                 │
│   133             throw common::Exception("Normalization wrong of table  " + name);                                                                                                                              │
│   134           }                                                                                                                                                                                                │
│   135                                                                                                                                                                                                            │
│  >136           if (table->s->keys > 1) {                                                                                                                                                                        │
│   137             STONEDB_LOG(LogCtl_Level::WARN, "Table :%s have other keys except primary key, only use primary key!", name.data());                                                                           │
│   138           } 

Only support primary key?

hustjieke commented 2 years ago

https://mariadb.com/kb/en/uniqueness-within-a-columnstore-table

hustjieke commented 2 years ago

Currently we do'not support unique key, only support primary key index like mariadb. Reference from https://mariadb.com/kb/en/uniqueness-within-a-columnstore-table:

ColumnStore like many other analytical database engines does not support unique constraints. This helps with performance and scaling out to much larger volumes than innodb supports. It is assumed that your data preparation / ETL phase will ensure correct data being fed into columnstore.

One common pattern is to use an innodb staging table with unique constraints to allow for elimnation of duplicates there then use non transactional insert into select into your columnstore table.

ping @shangyanwen

shangyanwen commented 2 years ago

Closure at the recommendation of R&D,we do'not support unique key