Open haitaoguan opened 1 year ago
The tianmu engine does not support transactions. If an instance crashes, either commit or rollback should occur instead of reporting an error.
The error message reported on session 3 is listed below:
mysql> select * from ttt;
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/home/lihao/workshop/bin_ver1/tmp/mysql.sock' (111)
ERROR:
Can't connect to the server
mysql>
After restarting the instance and re-running the query on session 2, we get the following message:
mysql> select * from ttt;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 2
Current database: test
ERROR 6 (HY000): An unknown system exception error caught.
mysql> select * from ttt;
The call stack at this point is listed below.
#0 Tianmu::system::TianmuFile::OpenReadOnly (this=0x7f2da6db1f10, file="./test/ttt.tianmu/columns/0/DATA") at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/system/file.cpp:67
#1 0x000055b3f133e02a in Tianmu::core::PackInt::PackInt (this=0x7f2a94925030, dpn=0x7f2a62008fa8, pc=..., s=0x7f2a94920d00) at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/data/pack_int.cpp:37
#2 0x000055b3f1141e49 in __gnu_cxx::new_allocator<Tianmu::core::PackInt>::construct<Tianmu::core::PackInt<Tianmu::core::DPN*&, Tianmu::core::ObjectId<(Tianmu::core::COORD_TYPE)0, 3, Tianmu::core::object_id_helper::empty> const&, Tianmu::core::ColumnShare*&> > (this=0x7f2da6db20ff, __p=0x7f2a94925030) at /usr/include/c++/9/ext/new_allocator.h:146
#3 0x000055b3f1140a86 in std::allocator_traits<std::allocator<Tianmu::core::PackInt> >::construct<Tianmu::core::PackInt<Tianmu::core::DPN*&, Tianmu::core::ObjectId<(Tianmu::core::COORD_TYPE)0, 3, Tianmu::core::object_id_helper::empty> const&, Tianmu::core::ColumnShare*&> > (__a=..., __p=0x7f2a94925030) at /usr/include/c++/9/bits/alloc_traits.h:483
#4 0x000055b3f113e329 in std::_Sp_counted_ptr_inplace<Tianmu::core::PackInt, std::allocator<Tianmu::core::PackInt>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<Tianmu::core::DPN*&, Tianmu::core::ObjectId<(Tianmu::core::COORD_TYPE)0, 3, Tianmu::core::object_id_helper::empty> const&, Tianmu::core::ColumnShare*&> (this=0x7f2a94925020, __a=...) at /usr/include/c++/9/bits/shared_ptr_base.h:548
#5 0x000055b3f113ad72 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<Tianmu::core::PackInt, std::allocator<Tianmu::core::PackInt>, Tianmu::core::DPN*&, Tianmu::core::ObjectId<(Tianmu::core::COORD_TYPE)0, 3, Tianmu::core::object_id_helper::empty> const&, Tianmu::core::ColumnShare*&> (this=0x7f2da6db2318, __p=@0x7f2da6db2310: 0x0, __a=...) at /usr/include/c++/9/bits/shared_ptr_base.h:679
#6 0x000055b3f11375da in std::__shared_ptr<Tianmu::core::PackInt, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<Tianmu::core::PackInt>, Tianmu::core::DPN*&, Tianmu::core::ObjectId<(Tianmu::core::COORD_TYPE)0, 3, Tianmu::core::object_id_helper::empty> const&, Tianmu::core::ColumnShare*&> (this=0x7f2da6db2310, __tag=...) at /usr/include/c++/9/bits/shared_ptr_base.h:1344
#7 0x000055b3f1134fa5 in std::shared_ptr<Tianmu::core::PackInt>::shared_ptr<std::allocator<Tianmu::core::PackInt>, Tianmu::core::DPN*&, Tianmu::core::ObjectId<(Tianmu::core::COORD_TYPE)0, 3, Tianmu::core::object_id_helper::empty> const&, Tianmu::core::ColumnShare*&> (this=0x7f2da6db2310, __tag=...) at /usr/include/c++/9/bits/shared_ptr.h:359
#8 0x000055b3f11329f9 in std::allocate_shared<Tianmu::core::PackInt, std::allocator<Tianmu::core::PackInt>, Tianmu::core::DPN*&, Tianmu::core::ObjectId<(Tianmu
#9 0x0000555fabed8e82 in Tianmu::core::TianmuAttr::Fetch (this=0x7fe73401ee00, pc=...) at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/vc/tianmu_attr.cpp:907
#10 0x0000555fabee5cb7 in Tianmu::core::DataCache::GetOrFetchObject<Tianmu::core::Pack, Tianmu::core::ObjectId<(Tianmu::core::COORD_TYPE)0, 3, Tianmu::core::object_id_helper::empty>, Tianmu::core::TianmuAttr> (this=0x555fb1209e20, coord_=..., fetcher_=0x7fe73401ee00) at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/core/data_cache.h:234
#11 0x0000555fabed8247 in Tianmu::core::TianmuAttr::LockPackForUse (this=0x7fe73401ee00, pn=0) at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/vc/tianmu_attr.cpp:824
#12 0x0000555fabdce56e in Tianmu::core::TianmuTable::LockPackForUse (this=0x7fe73401e150, attr=0, pack_no=0) at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/core/tianmu_table.cpp:512
#13 0x0000555fac0ed242 in Tianmu::core::VCPackGuardian::LockPackrowOnLockOneByThread (this=0x7fe734020588, mit=...) at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/data/pack_guardian.cpp:121
#14 0x0000555fac0ecda6 in Tianmu::core::VCPackGuardian::LockPackrow (this=0x7fe734020588, mit=...) at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/data/pack_guardian.cpp:63
#15 0x0000555fabd5ba08 in Tianmu::vcolumn::VirtualColumn::LockSourcePacks (this=0x7fe7340204b0, mit=...) at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/vc/virtual_column.h:45
#16 0x0000555fac1a0845 in Tianmu::core::ParameterizedFilter::FilterDeletedByTable (this=0x7fe7348fb7c0, rcTable=0x7fe73401e150, no_dims=@0x7fea11ccda50: 0, tableIndex=0)
at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/core/parameterized_filter.cpp:1710
#17 0x0000555fac1a0ac5 in Tianmu::core::ParameterizedFilter::FilterDeletedForSelectAll (this=0x7fe7348fb7c0) at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/core/parameterized_filter.cpp:1737
#18 0x0000555fac19d4ff in Tianmu::core::ParameterizedFilter::UpdateMultiIndex (this=0x7fe7348fb7c0, count_only=false, limit=-1)
at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/core/parameterized_filter.cpp:1121
#19 0x0000555fabd50d05 in Tianmu::core::Query::Preexecute (this=0x7fea11cce700, qu=..., sender=0x7fe734020420, display_now=true) at /home/lihao/workshop/stonedb-ver-1/storage/tianmu/core/query.cpp:797
(gdb) p file
$12 = "./test/ttt.tianmu/columns/0/DATA"
However, when the `data` directory is listed, the file `./test/ttt.tianmu/columns/0/DATA` cannot be found on my server.
xxx@ubuntu:~/workshop/bin_ver1/data/test/ttt.tianmu/columns/0$ ls
DN filters META v
Therefore, in this function, the file cannot be opened, and an exception is thrown.
 66 int TianmuFile::OpenReadOnly(std::string const &file) {
>67   return Open(file, O_RDONLY | O_LARGEFILE | O_BINARY, tianmu_umask);
 68 }
The root cause is that after the instance was killed in the middle of a `begin` statement, the `data` directory of that table was deleted from disk, but the metadata about this table was not removed. The result is an inconsistency between the data and its metadata.
First, we catch the exception and report a detailed error message. In the next stage, we fix the inconsistency between the data and its metadata.
mysql> select * from ttt;
ERROR 1 (HY000): An Tianmu Error system exception error caught. ErrorCode: 2 - No such file or directory[./test/ttt.tianmu/columns/0/DATA]
mysql>
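A detailed message like the one above can be built by capturing `errno` and the path at the point where the open call fails. The following is only a minimal sketch under stated assumptions, not the actual Tianmu implementation: `FileOpenError`, `OpenReadOnlyOrThrow`, and `DescribeOpenFailure` are hypothetical names introduced for illustration, and only the POSIX `open`/`close` calls and `std::strerror` are real APIs.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <stdexcept>
#include <string>
#include <unistd.h>

// Hypothetical exception type that carries the OS error code and the path.
class FileOpenError : public std::runtime_error {
 public:
  FileOpenError(int err, const std::string &path)
      : std::runtime_error("ErrorCode: " + std::to_string(err) + " - " +
                           std::strerror(err) + "[" + path + "]"),
        err_(err) {}
  int code() const { return err_; }

 private:
  int err_;
};

// Opens `path` read-only and returns the file descriptor.
// Throws FileOpenError instead of returning -1, so the caller can report the cause.
int OpenReadOnlyOrThrow(const std::string &path) {
  assert(!path.empty());  // precondition: a path must be given
  int fd = ::open(path.c_str(), O_RDONLY);
  if (fd < 0) {
    throw FileOpenError(errno, path);  // errno is read before any other call can change it
  }
  return fd;
}

// Returns a description of the failure, or an empty string if the file can be opened.
std::string DescribeOpenFailure(const std::string &path) {
  try {
    int fd = OpenReadOnlyOrThrow(path);
    ::close(fd);
    return "";
  } catch (const FileOpenError &e) {
    return e.what();
  }
}
```

Calling `DescribeOpenFailure` on a path whose `DATA` file is missing yields a message that includes both the error code and the offending path, which is the information the original "unknown system exception" message lacked.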
There are several places where the `data` directory, which is used to store the table data, is created.
In the insertion phase, there are several types of `LoadSource`, which determine whether the data is saved to file immediately or not.
For example:
TianmuAttr::LoadData () {
  xxx
  DPN &dpn = get_dpn(pi);
  if (current_txn_->LoadSource() == common::LoadSource::LS_File || dpn.numOfRecords == (1U << pss)) {
    Pack *pack = get_pack(pi);
    if (!dpn.Trivial()) {
      pack->Save();  // <-- the only place where the pack is written to the DATA file
    }
    if (pack) {
      pack->Unlock();
    }
    core::Engine *eng = reinterpret_cast<core::Engine *>(tianmu_hton->data);
    assert(eng);
    eng->cache.DropObject(get_pc(pi));
    dpn.SetRefCount(0);
  }
}
`pack->Save()` saves the data into `my_path/DATA`. If the data is not written to file but kept in memory, the corresponding directory has not been created by the time the instance is killed. Therefore, a "directory cannot be found" exception is thrown.
This issue occurs only with `tianmu_insert_delayed=0`. With this configuration, the data is inserted directly into memory and is not saved to the `DATA` file.
int Engine::InsertRow(const std::string &table_path, [[maybe_unused]] Transaction *trans_, TABLE *table,
                      std::shared_ptr<TableShare> &share) {
  int ret = 0;
  try {
    if (tianmu_sysvar_insert_delayed && table->s->tmp_table == NO_TMP_TABLE) {
      if (tianmu_sysvar_enable_rowstore) {
        ret = InsertToDelta(table_path, share, table);
      } else {
        InsertDelayed(table_path, table);
      }
      tianmu_stat.delta_insert++;
    } else {
      // insert directly when tianmu_insert_delayed=0
      current_txn_->SetLoadSource(common::LoadSource::LS_Direct);
      auto rct = current_txn_->GetTableByPath(table_path);
      ret = rct->Insert(table);
    }
    return ret;
This sets the load source to direct (`LS_Direct`). In this mode, `TianmuAttr::LoadData` does not call `pack->Save()` to save the data to disk unless the pack is full (`dpn.numOfRecords == (1U << pss)`). Therefore, after the instance is killed, all the data is lost, and the `DATA` file has not been created at that point.
xxx@ubuntu:~/workshop/bin_ver1/data/test/ttt.tianmu/columns/0$ ls
DN filters META v
This directory does not contain any file named `DATA`.
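The save condition quoted from `TianmuAttr::LoadData` can be isolated into a small predicate to see why a handful of direct inserts never reaches disk. This is a simplified model, not Tianmu code: the `LoadSource` enum here is a stand-in with only the two values named in this issue, `PackIsSavedToDisk` is a hypothetical helper, and the value `pss = 16` used in the note below is an assumption chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for common::LoadSource (only the values named in this issue).
enum class LoadSource { LS_File, LS_Direct };

// Mirrors the condition quoted from TianmuAttr::LoadData:
//   LoadSource() == LS_File || dpn.numOfRecords == (1U << pss)
// Returns true when the pack would be written to the DATA file.
bool PackIsSavedToDisk(LoadSource source, uint32_t num_of_records, uint32_t pss) {
  assert(pss < 32);  // 1U << pss must stay within 32 bits
  return source == LoadSource::LS_File || num_of_records == (1U << pss);
}
```

With an assumed `pss` of 16, `PackIsSavedToDisk(LoadSource::LS_Direct, 2, 16)` is false: the two rows inserted in the session shown in this issue stay in memory, so a crash at that point leaves no `DATA` file behind. The same call with `LoadSource::LS_File`, or with a full pack of `1U << 16` rows, is true.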
In addition, transactions do not behave as expected:
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into ttt values(1,'AAA');
Query OK, 1 row affected (0.01 sec)
mysql> insert into ttt values(2,'BBB');
Query OK, 1 row affected (0.00 sec)
mysql> select * from ttt;
+------+------+
| id | name |
+------+------+
| 1 | AAA |
| 2 | BBB |
+------+------+
2 rows in set (0.00 sec)
mysql> rollback;
Query OK, 0 rows affected (0.00 sec)
mysql> quit
Bye
xxx@ubuntu:~/workshop/bin_ver1$ ./bin/mysql -uroot -p123456
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.36-StoneDB-v1.0.3.2579bd4aa build-
Copyright (c) 2021, 2022 StoneAtom Group Holding Limited
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from ttt;
+------+------+
| id | name |
+------+------+
| 1 | AAA |
| 2 | BBB |
+------+------+
2 rows in set (0.00 sec)
As shown above, after the `rollback` command was executed, the data still exists.
Root cause:
If we set `tianmu_insert_delayed=0`, the data is not written to `DATA` but into memory. That is obsolete behavior; all the data is now meant to be written into the row store.
The solution for now: write all the data to `DATA` immediately.
After that change, it behaves as follows:
mysql> use test;
No connection. Trying to reconnect...
Connection id: 2
Current database: *** NONE ***
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from ttt;
+------+------+
| id | name |
+------+------+
| 1 | AAA |
| 2 | BBB |
| 2 | BBB |
+------+------+
3 rows in set (0.00 sec)
We cannot guarantee the atomicity of the write process. If writing a large amount of data into `DATA` and syncing it to disk fails, it may lead to data inconsistency.
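For reference, one common way to make a single file update all-or-nothing on POSIX systems is to write to a temporary file, `fsync` it, and then `rename` it over the destination. The sketch below illustrates that general technique only; it is not what Tianmu does, `AtomicWriteFile` and the `.tmp` suffix are hypothetical, and it does not address atomicity across the several files that make up a table.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <fcntl.h>
#include <string>
#include <system_error>
#include <unistd.h>

// Writes `data` to `path` so that readers see either the old file or the complete new one.
// Hypothetical helper for illustration; not part of Tianmu.
void AtomicWriteFile(const std::string &path, const std::string &data) {
  const std::string tmp = path + ".tmp";  // assumed naming scheme for the staging file
  int fd = ::open(tmp.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0) throw std::system_error(errno, std::generic_category(), "open " + tmp);

  size_t written = 0;
  while (written < data.size()) {
    ssize_t n = ::write(fd, data.data() + written, data.size() - written);
    if (n < 0) {
      if (errno == EINTR) continue;  // retry writes interrupted by a signal
      int err = errno;
      ::close(fd);
      ::unlink(tmp.c_str());  // discard the partial staging file
      throw std::system_error(err, std::generic_category(), "write " + tmp);
    }
    written += static_cast<size_t>(n);
  }
  assert(written == data.size());

  // Flush the file contents to stable storage before publishing the new name.
  if (::fsync(fd) != 0) {
    int err = errno;
    ::close(fd);
    ::unlink(tmp.c_str());
    throw std::system_error(err, std::generic_category(), "fsync " + tmp);
  }
  ::close(fd);

  // rename() atomically replaces the destination when both paths are on the same file system.
  if (std::rename(tmp.c_str(), path.c_str()) != 0) {
    int err = errno;
    ::unlink(tmp.c_str());
    throw std::system_error(err, std::generic_category(), "rename " + tmp);
  }
}
```

If the process is killed before the `rename`, the destination still holds its previous contents and only a stray `.tmp` file remains. Making the rename itself durable across power loss would additionally require an `fsync` on the parent directory, which this sketch omits.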
PR #1841 fixes this unexpected exception, but it causes the unexpected disk space usage reported in #1845. Weighing the priorities of these two issues, we revert PR #1841 first.
Environment
./mysqld Ver 5.7.36-StoneDB-v1.0.3 for Linux on x86_64 (build-)
build information as follow:
Repository address: https://github.com/stoneatom/stonedb.git:stonedb-5.7-dev
Branch name: stonedb-5.7-dev
Last commit ID: 31919be
Last commit time: Date: Thu Apr 20 10:19:54 2023 +0800
Build time: Date: Sun Apr 23 12:03:01 CST 2023