stoneatom / stonedb

StoneDB is an Open-Source MySQL HTAP and MySQL-Native DataBase for OLTP, Real-Time Analytics, a counterpart of MySQL HeatWave. (https://stonedb.io)
GNU General Public License v2.0
862 stars, 139 forks

bug: a strange question, create table return error #1831

Open davidshiz opened 1 year ago

davidshiz commented 1 year ago

Have you read the Contributing Guidelines on issues?

Please confirm that this bug report does NOT already exist.

Describe the problem

The problem occurs intermittently; we are continuing to monitor it.

mysql> use test;
Database changed

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| t5             |
| t6             |
| t7             |
| t8             |
+----------------+
4 rows in set (0.00 sec)

mysql> create table t1 (f1 varchar(5));
ERROR 6 (HY000): Directory /stonedb57/install/data/tianmu_data/25.0 already exists!
mysql> create table t1 (f1 varchar(5));
ERROR 1030 (HY000): Got error 1 from storage engine
mysql> create table t1 (f1 varchar(5));
ERROR 6 (HY000): Directory /stonedb57/install/data/tianmu_data/27.0 already exists!
mysql> create table t1 (f1 varchar(5));
ERROR 1030 (HY000): Got error 1 from storage engine

Expected behavior

No response

How To Reproduce

No response

Environment

root@ub01:~# /stonedb57/install/bin/mysqld --version
/stonedb57/install/bin/mysqld  Ver 5.7.36-StoneDB-v1.0.3 for Linux on x86_64 (build-)
build information as follow:
        Repository address: https://github.com/stoneatom/stonedb.git:stonedb-5.7-dev
        Branch name: stonedb-5.7-dev
        Last commit ID: 20abfc729
        Last commit time: Date:   Tue May 16 16:26:45 2023 +0800
        Build time: Date: Wed May 17 19:48:49 CST 2023

Are you interested in submitting a PR to solve the problem?

RingsC commented 1 year ago
TianmuTable::CreateNew (xxx) {
...
  for (size_t idx = 0; idx < no_attrs; idx++) {
    auto dir = Engine::GetNextDataDir();
    dir /= std::to_string(tid) + "." + std::to_string(idx);
    if (system::DoesFileExist(dir)) {
      throw common::DatabaseException("Directory " + dir.string() + " already exists!");
    }
    fs::create_directory(dir);
    auto lnk = column_path / std::to_string(idx);
    fs::create_symlink(dir, lnk);

    TianmuAttr::Create(lnk, opt->atis[idx], opt->pss, 0, auto_inc_value);
    // TIANMU_LOG(LogCtl_Level::INFO, "Column %zu at %s", idx, dir.c_str());
  }
...
}

From the line dir /= std::to_string(tid) + "." + std::to_string(idx); and the error messages, we can tell that tid was 25 and then 27, and that idx was 0.

The tid comes from uint32_t tid = eng->GetNextTableId(); and is persisted to disk in tianmu_data_dir/tianmu.tid.

uint32_t Engine::GetNextTableId() {
  static std::mutex seq_mtx;

  std::scoped_lock lk(seq_mtx);
  fs::path p = tianmu_data_dir / "tianmu.tid";
  if (!fs::exists(p)) {
    TIANMU_LOG(LogCtl_Level::INFO, "Creating table id file");
    std::ofstream seq_file(p.string());
    if (seq_file)
      seq_file << 0;
    if (!seq_file) {
      throw common::FileException("Failed to write to table id file");
    }
  }

  uint32_t seq;
  std::fstream seq_file(p.string());
  if (seq_file)
    seq_file >> seq;
  if (!seq_file) {
    throw common::FileException("Failed to read from table id file");
  }
  seq++;
  seq_file.seekg(0);
  seq_file << seq;
  if (!seq_file) {
    throw common::FileException("Failed to write to table id file");
  }

  return seq;
}
RingsC commented 1 year ago

@davidshiz The next time you hit a similar bug, please use a build with build type RelWithDebInfo and dump the call stack, or ask the developers for help. That will help a lot in identifying the root cause.

RingsC commented 1 year ago

Creating a table is not an atomic operation. If the instance crashes in the middle of the creation phase, the directories it has already created are not removed. When the create command is run again, it may reuse the old table id to generate the table path, but that path already exists, so the exception is thrown.
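A minimal sketch of one way to avoid leaving directories behind when a creation step fails with an exception. The names CreatedPathsGuard and CreateColumnDirs are hypothetical and are not StoneDB code; only the "<tid>.<idx>" directory naming is taken from CreateNew above. Note that a guard like this only covers failures reported as exceptions. It does not help if the process itself crashes, which would still need a cleanup pass at startup.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <stdexcept>
#include <string>
#include <system_error>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical guard (not part of StoneDB): remembers every path created
// during a multi-step operation and removes them unless Commit() is called.
class CreatedPathsGuard {
 public:
  CreatedPathsGuard() = default;
  CreatedPathsGuard(const CreatedPathsGuard &) = delete;
  CreatedPathsGuard &operator=(const CreatedPathsGuard &) = delete;

  ~CreatedPathsGuard() {
    if (committed_) return;
    // Undo in reverse creation order. Errors are ignored because a
    // destructor must not throw.
    for (auto it = paths_.rbegin(); it != paths_.rend(); ++it) {
      std::error_code ec;
      fs::remove_all(*it, ec);
    }
  }

  void Track(fs::path p) { paths_.push_back(std::move(p)); }
  void Commit() { committed_ = true; }

 private:
  std::vector<fs::path> paths_;
  bool committed_ = false;
};

// Creates one "<tid>.<idx>" directory per column under `base`. If any step
// throws, the directories created by this call are removed again.
void CreateColumnDirs(const fs::path &base, unsigned tid, std::size_t no_attrs) {
  assert(fs::is_directory(base));
  CreatedPathsGuard guard;
  for (std::size_t idx = 0; idx < no_attrs; idx++) {
    fs::path dir = base / (std::to_string(tid) + "." + std::to_string(idx));
    // create_directory returns false when the directory already exists.
    if (!fs::create_directory(dir)) {
      throw std::runtime_error("Directory " + dir.string() + " already exists!");
    }
    guard.Track(dir);
  }
  guard.Commit();  // every step succeeded, keep the directories
}
```

Directories that existed before the call are never tracked, so a failed attempt removes only what that attempt created.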

Engine::GetNextTableId() gets the next table id from the tianmu.tid file.

    seq++;
    seq_file.seekg(0);
    seq_file << seq;
  > seq_file.flush();  // sync to disk mandatory.
    if (!seq_file) {
      throw common::FileException("Failed to write to table id file");

(gdb) p seq
$2 = 51
(gdb) n
(gdb) 

At this point, before the flush has run, the tianmu.tid file still contains 50; only after the flush does it contain the new value, 51. If the instance crashes at TIANMU_CREATE_TABLE_PHASE5, the files have already been created, but the new TID in tianmu.tid may be lost because it was never flushed to disk.
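A hedged sketch of a stricter way to persist the id, assuming a POSIX system. Calling flush() on a stream hands the bytes to the operating system, which is enough to survive a crash of the process; surviving an operating system crash or power loss additionally needs fsync(). The helpers WriteIdDurably and NextTableId are hypothetical names, not the StoneDB implementation. The mutex from the original function and an fsync() of the containing directory are left out for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

#include <fcntl.h>
#include <unistd.h>

namespace fs = std::filesystem;

// Hypothetical helper (POSIX only): writes `value` to a temporary file,
// forces it to stable storage, then renames it over `file`.
void WriteIdDurably(const fs::path &file, std::uint32_t value) {
  assert(!file.empty());
  const fs::path tmp = file.string() + ".tmp";
  const std::string text = std::to_string(value);

  int fd = ::open(tmp.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0) throw std::runtime_error("Failed to open temporary table id file");

  // The payload is only a few bytes, so a short write is treated as failure.
  bool ok = ::write(fd, text.data(), text.size()) == static_cast<ssize_t>(text.size());
  ok = ok && ::fsync(fd) == 0;    // ask the OS to write the data to disk
  ok = (::close(fd) == 0) && ok;  // always close, even after a failure
  if (!ok) throw std::runtime_error("Failed to write to table id file");

  // rename() replaces the target atomically on POSIX file systems, so a
  // reader sees either the old id or the new one, never a partial write.
  fs::rename(tmp, file);
}

// Hypothetical counterpart of GetNextTableId(): reads the current id
// (0 if the file does not exist yet), increments it and persists it.
std::uint32_t NextTableId(const fs::path &file) {
  std::uint32_t seq = 0;
  if (fs::exists(file)) {
    std::ifstream in(file);
    if (!(in >> seq)) throw std::runtime_error("Failed to read from table id file");
  }
  seq++;
  WriteIdDurably(file, seq);
  return seq;
}
```

Because the new id is on disk before any table directory is created, a crash later in the create phase can leave unused directories behind, but a retry would get a fresh id instead of the old one.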

Another place that needs attention is Engine::DeleteTable. When a table is deleted, all of its files and directories should be removed from disk. If the instance crashes in the middle of the delete operation, some directories or files remain on disk. The next create table command may then fail if the server gets the same table id from GetNextTableId() and tries to create the same files or directories.
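One possible mitigation, sketched under the assumption that leftover column directories follow the "<tid>.<idx>" naming seen in the error messages: before using a candidate table id, check whether any directory with that prefix is still on disk and skip the id if so. HasLeftoverDirs and FirstUnusedTableId are hypothetical helpers, not existing StoneDB functions.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: true if `data_dir` still contains an entry whose
// name starts with "<tid>.", for example "25.0" for tid 25.
bool HasLeftoverDirs(const fs::path &data_dir, std::uint32_t tid) {
  assert(fs::is_directory(data_dir));
  const std::string prefix = std::to_string(tid) + ".";
  for (const auto &entry : fs::directory_iterator(data_dir)) {
    const std::string name = entry.path().filename().string();
    if (name.compare(0, prefix.size(), prefix) == 0) return true;
  }
  return false;
}

// Hypothetical helper: returns the first id at or after `candidate` that
// has no leftover directories in `data_dir`.
std::uint32_t FirstUnusedTableId(const fs::path &data_dir, std::uint32_t candidate) {
  while (HasLeftoverDirs(data_dir, candidate)) candidate++;
  return candidate;
}
```

Skipping an id hides the leftovers rather than removing them, so a cleanup step for orphaned directories would still be needed to reclaim the space.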

RingsC commented 1 year ago

Create Table, Delete Table, Rename Table, and similar operations are not atomic, so the data can be left in an inconsistent state after the instance restarts.

ERROR 1030 (HY000): Got error 1 from storage engine
mysql> create table t4 (a int ) engine = tianmu;