facebook / mysql-5.6

Facebook's branch of the Oracle MySQL database. This includes MyRocks.
http://myrocks.io
Other
2.49k stars 715 forks source link

MyRocks: 1 hour stall in DelayWrite() during bulk load #348

Closed RickPizzi closed 7 years ago

RickPizzi commented 8 years ago

Hi,

still trying to load our 830 tables stats server into MyRocks, we noticed a very long (> 1 hr) stall during the bulk load, when it was about to complete. Only 2 threads from myloader were left, but there has been no progress for more than one hour (SST files untouched, by looking at the files modification time).

I ran a pstack to check what was going on (output below). From what I can see, there were 2 threads trying to commit (do_bulk_commit) and a 3rd thread was a background thread trying to persist statistics (persist_stats). Out of the 2 committing thread, one was stuck in DelayWrite() and the other one waiting on a mutex.

After some more time everything magically resolved by itself (not sure if my pstack command was part of the black magic here).

Relevant output from pstack follows.

Thread 17 (Thread 0x7f9f735fe700 (LWP 30412)):

0 0x00007fa102d4768c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

1 0x00007fa1015aa37c in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /usr/local/lib64/libstdc++.so.6

2 0x0000000000ca221c in rocksdb::WriteThread::BlockingAwaitState(rocksdb::WriteThread::Writer*, unsigned char) ()

3 0x0000000000ca2348 in rocksdb::WriteThread::AwaitState(rocksdb::WriteThread::Writer, unsigned char, rocksdb::WriteThread::AdaptationContext) ()

4 0x0000000000ca28a4 in rocksdb::WriteThread::JoinBatchGroup(rocksdb::WriteThread::Writer*) ()

5 0x0000000000c0927f in rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch, rocksdb::WriteCallback, unsigned long*, unsigned long, bool) ()

6 0x0000000000c0b13b in rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) ()

7 0x0000000000baa4cd in myrocks::Rdb_dict_manager::commit(rocksdb::WriteBatch*, bool) ()

8 0x0000000000badc75 in myrocks::Rdb_ddl_manager::persist_stats(bool) ()

9 0x0000000000b8803c in myrocks::Rdb_background_thread::run() ()

10 0x0000000000bdef54 in myrocks::Rdb_thread::thread_func(void*) ()

11 0x00007fa102d43aa1 in start_thread () from /lib64/libpthread.so.0

12 0x00007fa100dacaad in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7fa0fc08f700 (LWP 6071)):

0 0x00007fa100d70bdd in nanosleep () from /lib64/libc.so.6

1 0x00007fa100da5d94 in usleep () from /lib64/libc.so.6

2 0x0000000000be895f in rocksdb::DBImpl::DelayWrite(unsigned long) ()

3 0x0000000000c0abfa in rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch, rocksdb::WriteCallback, unsigned long*, unsigned long, bool) ()

4 0x0000000000c0b13b in rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) ()

5 0x0000000000d37838 in rocksdb::TransactionImpl::Commit() ()

6 0x0000000000b9ec6b in myrocks::Rdb_transaction_impl::commit_no_binlog() ()

7 0x0000000000b8c25d in myrocks::ha_rocksdb::do_bulk_commit(myrocks::Rdb_transaction*) ()

8 0x0000000000b92fac in myrocks::ha_rocksdb::update_writerow(unsigned char const, unsigned char const_, bool) ()

9 0x0000000000b9301d in myrocks::ha_rocksdb::write_row(unsigned char*) ()

10 0x0000000000620334 in handler::ha_write_row(unsigned char*) ()

11 0x0000000000e47f98 in ha_partition::write_row(unsigned char*) ()

12 0x0000000000620334 in handler::ha_write_row(unsigned char*) ()

13 0x000000000076daad in writerecord(THD, TABLE_, COPYINFO, COPYINFO) ()

14 0x0000000000772f28 in mysqlinsert(THD, TABLELIST, List&, List<List >&, List&, List&, enum_duplicates, bool) ()

15 0x000000000078e420 in mysql_executecommand(THD, unsigned long long_, unsigned long long*) ()

16 0x0000000000793aef in mysqlparse(THD, char_, unsigned int, Parserstate, unsigned long long, char, std::__cxx11::basic_string<char, std::chartraits, std::allocator > const&, char) ()

17 0x0000000000795442 in dispatch_command(enum_servercommand, THD, char_, unsigned int) ()

18 0x0000000000753adc in do_handle_one_connection(THD*) ()

19 0x0000000000753d89 in handle_one_connection ()

20 0x00007fa102d43aa1 in start_thread () from /lib64/libpthread.so.0

21 0x00007fa100dacaad in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7fa0f42b5700 (LWP 6085)):

0 0x00007fa102d4768c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

1 0x00007fa1015aa37c in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /usr/local/lib64/libstdc++.so.6

2 0x0000000000ca221c in rocksdb::WriteThread::BlockingAwaitState(rocksdb::WriteThread::Writer*, unsigned char) ()

3 0x0000000000ca2348 in rocksdb::WriteThread::AwaitState(rocksdb::WriteThread::Writer, unsigned char, rocksdb::WriteThread::AdaptationContext) ()

4 0x0000000000ca28a4 in rocksdb::WriteThread::JoinBatchGroup(rocksdb::WriteThread::Writer*) ()

5 0x0000000000c0927f in rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch, rocksdb::WriteCallback, unsigned long*, unsigned long, bool) ()

6 0x0000000000c0b13b in rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) ()

7 0x0000000000d37838 in rocksdb::TransactionImpl::Commit() ()

8 0x0000000000b9ec6b in myrocks::Rdb_transaction_impl::commit_no_binlog() ()

9 0x0000000000b8c25d in myrocks::ha_rocksdb::do_bulk_commit(myrocks::Rdb_transaction*) ()

10 0x0000000000b92fac in myrocks::ha_rocksdb::update_writerow(unsigned char const, unsigned char const_, bool) ()

11 0x0000000000b9301d in myrocks::ha_rocksdb::write_row(unsigned char*) ()

12 0x0000000000620334 in handler::ha_write_row(unsigned char*) ()

13 0x0000000000e47f98 in ha_partition::write_row(unsigned char*) ()

14 0x0000000000620334 in handler::ha_write_row(unsigned char*) ()

15 0x000000000076daad in writerecord(THD, TABLE_, COPYINFO, COPYINFO) ()

16 0x0000000000772f28 in mysqlinsert(THD, TABLELIST, List&, List<List >&, List&, List&, enum_duplicates, bool) ()

17 0x000000000078e420 in mysql_executecommand(THD, unsigned long long_, unsigned long long*) ()

18 0x0000000000793aef in mysqlparse(THD, char_, unsigned int, Parserstate, unsigned long long, char, std::__cxx11::basic_string<char, std::chartraits, std::allocator > const&, char) ()

19 0x0000000000795442 in dispatch_command(enum_servercommand, THD, char_, unsigned int) ()

20 0x0000000000753adc in do_handle_one_connection(THD*) ()

21 0x0000000000753d89 in handle_one_connection ()

22 0x00007fa102d43aa1 in start_thread () from /lib64/libpthread.so.0

23 0x00007fa100dacaad in clone () from /lib64/libc.so.6

RickPizzi commented 8 years ago

This just happened again, about 20 minutes later, with identical stack traces

RickPizzi commented 8 years ago

Relevant config

rocksdb skip-innodb default-storage-engine=rocksdb default-tmp-storage-engine=MyISAM collation-server=latin1_bin binlog_format = ROW

rocksdb_max_open_files=-1 rocksdb_base_background_compactions=1 rocksdb_max_total_wal_size=4G rocksdb_block_size=16384 rocksdb_block_cache_size=16G rocksdb_table_cache_numshardbits=6 rocksdb_skip_unique_check=1 rocksdb_commit_in_the_middle=1 rocksdb_write_disable_wal=1 rocksdb_max_background_flushes=40 rocksdb_max_background_compactions=40 rocksdb_default_cf_options=write_buffer_size=128m;level0_file_num_compaction_trigger=4;level0_slowdown_writes_trigger=256;level0_stop_writes_trigger=256;max_write_buffer_number=16;memtable=vector:1024

RickPizzi commented 8 years ago

Can it perhaps be related to rocksdb_stats_dump_period_sec value? It is happening at regular intervals it seems. Mind you, MyRocks has 32100+ files open at this time (I opened a different issue about this)

yoshinorim commented 8 years ago

Could you paste "show global status like '%hash%';" output? We had RocksDB stall bugs at DelayWrite() before, but newer rocksdb revision fixed known bugs.

RickPizzi commented 8 years ago

Unfortunately my installation doesn't seem to have this info for some reason....

mysql> show global status like '%hash%'; +------------------+--------------------+ | Variable_name | Value | +------------------+--------------------+ | git_hash | @MYSQL_GIT_HASH@ | | rocksdb_git_hash | @ROCKSDB_GIT_HASH@ | +------------------+--------------------+ 2 rows in set (0.01 sec)

Perhaps I can fetch that same info from a file in .git folder?

jkedgar commented 8 years ago

The cmake process is supposed to assign values to MYSQL_GIT_HASH and ROCKSDB_GIT_HASH but that doesn't seem to have occurred in this case.

Yes - send the git hash from your repository and from the rocksdb submodule.

RickPizzi commented 8 years ago

Dunno why, I followed instructions found on your wiki page here.... anyways, git rev-parse HEAD gives: 86587affafe77ef555f7c3839839de44f0f203f3 I'm not git expert so if I need to run some other command please let me know.

mdcallag commented 8 years ago

The first commits from this would be useful cd $mysql-src; git log cd $mysql-src/rocksdb; git log

On Fri, Oct 21, 2016 at 11:06 AM, Rick Pizzi notifications@github.com wrote:

Unfortunately my installation doesn't seem to have this info for some reason....

mysql> show global status like '%hash%'; +------------------+--------------------+ | Variable_name | Value | +------------------+--------------------+ | git_hash | @MYSQL_GIT_HASH@ | | rocksdb_git_hash | @ROCKSDB_GIT_HASH@ | +------------------+--------------------+ 2 rows in set (0.01 sec)

Perhaps I can fetch that same info from a file in .git folder?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/facebook/mysql-5.6/issues/348#issuecomment-255435150, or mute the thread https://github.com/notifications/unsubscribe-auth/ABkKTQyXKQqXqHTeexHZclZO3xtAEzdVks5q2P8VgaJpZM4KdAAC .

Mark Callaghan mdcallag@gmail.com

RickPizzi commented 8 years ago

Mark: by �first commits", you mean more recent ones?

Thanks

Rick

Riccardo Pizzi pizzi@leopardus.com http://www.vecchiflipper.it http://www.pinballowners.com/webmaster

Dogs have masters. Cats have staff.

On 25/ott/2016, at 20:19, Mark Callaghan notifications@github.com wrote:

The first commits from this would be useful cd $mysql-src; git log cd $mysql-src/rocksdb; git log

On Fri, Oct 21, 2016 at 11:06 AM, Rick Pizzi notifications@github.com wrote:

Unfortunately my installation doesn't seem to have this info for some reason....

mysql> show global status like '%hash%'; +------------------+--------------------+ | Variable_name | Value | +------------------+--------------------+ | git_hash | @MYSQL_GIT_HASH@ | | rocksdb_git_hash | @ROCKSDB_GIT_HASH@ | +------------------+--------------------+ 2 rows in set (0.01 sec)

Perhaps I can fetch that same info from a file in .git folder?

� You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/facebook/mysql-5.6/issues/348#issuecomment-255435150, or mute the thread https://github.com/notifications/unsubscribe-auth/ABkKTQyXKQqXqHTeexHZclZO3xtAEzdVks5q2P8VgaJpZM4KdAAC .

Mark Callaghan mdcallag@gmail.com � You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

mdcallag commented 8 years ago

Please share the first (top) commit it shows.

On Tue, Oct 25, 2016 at 11:23 AM, Rick Pizzi notifications@github.com wrote:

Mark: by �first commits", you mean more recent ones?

Thanks

Rick

Riccardo Pizzi pizzi@leopardus.com http://www.vecchiflipper.it http://www.pinballowners.com/webmaster

Dogs have masters. Cats have staff.

On 25/ott/2016, at 20:19, Mark Callaghan notifications@github.com wrote:

The first commits from this would be useful cd $mysql-src; git log cd $mysql-src/rocksdb; git log

On Fri, Oct 21, 2016 at 11:06 AM, Rick Pizzi notifications@github.com wrote:

Unfortunately my installation doesn't seem to have this info for some reason....

mysql> show global status like '%hash%'; +------------------+--------------------+ | Variable_name | Value | +------------------+--------------------+ | git_hash | @MYSQL_GIT_HASH@ | | rocksdb_git_hash | @ROCKSDB_GIT_HASH@ | +------------------+--------------------+ 2 rows in set (0.01 sec)

Perhaps I can fetch that same info from a file in .git folder?

� You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/facebook/mysql-5.6/issues/348# issuecomment-255435150, or mute the thread https://github.com/notifications/unsubscribe-auth/ ABkKTQyXKQqXqHTeexHZclZO3xtAEzdVks5q2P8VgaJpZM4KdAAC .

Mark Callaghan mdcallag@gmail.com � You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/facebook/mysql-5.6/issues/348#issuecomment-256123341, or mute the thread https://github.com/notifications/unsubscribe-auth/ABkKTZh-3XG9j4uyatvgvjrVOVyV5xT6ks5q3kkGgaJpZM4KdAAC .

Mark Callaghan mdcallag@gmail.com

RickPizzi commented 8 years ago

Here you go.

pwd

/home/rpizzi/mysql-fb

git log | head -100

commit 86587affafe77ef555f7c3839839de44f0f203f3 Author: Tian Xia tianx@fb.com Date: Tue Oct 4 10:01:52 2016 -0700

Allow filtering of show commands through admission control

Summary:
Added SHOW command filter in admission control.

Squash with https://reviews.facebook.net/D55185
Squash with https://reviews.facebook.net/D48405

Test Plan: mtr

Reviewers: santoshb, jkedgar

Reviewed By: jkedgar

Subscribers: webscalesql-eng

Differential Revision: https://reviews.facebook.net/D64629

commit 8959b43b017dbb41c1504368b47a570828aa0a86 Author: Anirban Rahut arahut@fb.com was built with so we can skip tests that are known to fail without full regex support

Test Plan: Run rocksdb.collation on the regular build and see that it passes.  Then run it on the --gcc48 build and see that it gets skipped.

Reviewers: hermanlee4, mung, gunnarku

Reviewed By: mung

Subscribers: webscalesql-eng

Differential Revision: https://reviews.facebook.net/D64515

commit b0904cce900baca5a092035edd24048f77687798 Author: Alex Yang alexyang@fb.com Date: Fri Aug 5 13:29:40 2016 -0700

Add sstfilewriter api improvements to fast secondary index creation

Summary:
This diff utilizes the recently implemented SSTFileWriter API for fast secondary index creation.

Writing to SST files directly requires the records of each secondary key to be written in order, to do this we need to leverage an external sort as the index is often larger than what can be held memory.

We iterate through the primary key entries, writing the offsets of each entry into a balanced binary tree (std::set, which uses red-black tree under the hood).  The tree itself uses a custom comparator which does a byte-wise comparison of records during insertion, which it uses to maintain the ordering of the tree.

Once we have reached the limit of what our sort buffer can hold in memory (RDB_MERGE_BUF_SIZE), we write to a tmp file out to disk in sorted order by iterating through the sorted "offset tree".  We continue to do this until all entries in the primary key have been traversed.  At this point we have several sorted chunks in a tmp_file on disk.

pwd

/home/rpizzi/mysql-fb/rocksdb

git log | head -100

commit 6a17b07ca856e573fabd6345d70787d4e481f57b Author: Islam AbdelRahman tec@fb.com Date: Fri Aug 19 16:17:56 2016 -0700

Add TablePropertiesCollector support in SstFileWriter

Summary: Update SstFileWriter to use user TablePropertiesCollectors that are passed in Options

Test Plan: unittests

Reviewers: sdong

Reviewed By: sdong

Subscribers: jkedgar, andrewkr, hermanlee4, dhruba, yoshinorim

Differential Revision: https://reviews.facebook.net/D62253

  make all check -j32
  ROCKSDB_FBCODE_BUILD_WITH_481=1 make all check -j32
  ROCKSDB_NO_FBCODE=1 make all check -j32

  export TBB_BASE=/mnt/gvfs/third-party2/tbb/afa54b33cfcf93f1d90a3160cdb894d6d63d5dca/4.0_update2/gcc-4.9-glibc-2.20/e9936bf;
  ROCKSDB_NO_FBCODE=1 CFLAGS="-I $TBB_BASE/include" LDFLAGS="-L $TBB_BASE/lib -Wl,-rpath=$TBB_BASE/lib" make all check -j32

Reviewers: IslamAbdelRahman, sdong

Reviewed By: sdong

Subscribers: kradhakrishnan, yhchiang, IslamAbdelRahman, andrewkr, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D56979

commit 49d88be021bb83ddc4b8fe44dbf8f28fc00868b5 Author: Jay busyjaylee@gmail.com Date: Thu Aug 18 09:48:43 2016 +0800

c abi: allow compaction filter ignore snapshot (#1268)

close #1262

commit 0b63f51fbc4321bf67175be5c2b12f2884f23f90 Author: Alexander Jipa alexander.jipa@gmail.com Date: Wed Aug 17 19:59:22 2016 -0400

fixes 1215: execute_process(COMMAND mkdir ${DIR}) fails to create a directory with cmake on Windows (#1219)

CMake - Use a platform-neutral mkdir function

commit 3981345be191dc676833341bac99372dc7940b86

Riccardo Pizzi pizzi@leopardus.com http://www.vecchiflipper.it http://www.pinballowners.com/webmaster

Dogs have masters. Cats have staff.

On 25/ott/2016, at 20:29, Mark Callaghan notifications@github.com wrote:

Please share the first (top) commit it shows.

On Tue, Oct 25, 2016 at 11:23 AM, Rick Pizzi notifications@github.com wrote:

Mark: by �first commits", you mean more recent ones?

Thanks

Rick

Riccardo Pizzi pizzi@leopardus.com http://www.vecchiflipper.it http://www.pinballowners.com/webmaster

Dogs have masters. Cats have staff.

On 25/ott/2016, at 20:19, Mark Callaghan notifications@github.com wrote:

The first commits from this would be useful cd $mysql-src; git log cd $mysql-src/rocksdb; git log

On Fri, Oct 21, 2016 at 11:06 AM, Rick Pizzi notifications@github.com wrote:

Unfortunately my installation doesn't seem to have this info for some reason....

mysql> show global status like '%hash%'; +------------------+--------------------+ | Variable_name | Value | +------------------+--------------------+ | git_hash | @MYSQL_GIT_HASH@ | | rocksdb_git_hash | @ROCKSDB_GIT_HASH@ | +------------------+--------------------+ 2 rows in set (0.01 sec)

Perhaps I can fetch that same info from a file in .git folder?

� You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/facebook/mysql-5.6/issues/348# issuecomment-255435150, or mute the thread https://github.com/notifications/unsubscribe-auth/ ABkKTQyXKQqXqHTeexHZclZO3xtAEzdVks5q2P8VgaJpZM4KdAAC .

Mark Callaghan mdcallag@gmail.com � You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/facebook/mysql-5.6/issues/348#issuecomment-256123341, or mute the thread https://github.com/notifications/unsubscribe-auth/ABkKTZh-3XG9j4uyatvgvjrVOVyV5xT6ks5q3kkGgaJpZM4KdAAC .

Mark Callaghan mdcallag@gmail.com — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

yoshinorim commented 8 years ago

@RickPizzi We added new (much more efficient) data loading feature and fixed all known bugs. Could you please pull all changes (git pull --rebase), build, and try loading data again (See load optimizations at https://github.com/facebook/mysql-5.6/wiki/data-loading)? DelayWrite() stall is expected to disappear with our new data loading feature.

RickPizzi commented 8 years ago

Sure! Will do and report results back. Thank you!

Riccardo Pizzi pizzi@leopardus.com http://www.vecchiflipper.it http://www.pinballowners.com/webmaster

Dogs have masters. Cats have staff.

On 31/ott/2016, at 22:55, Yoshinori Matsunobu notifications@github.com wrote:

@RickPizzi We added new (much more efficient) data loading feature and fixed all known bugs. Could you please pull all changes (git pull --rebase), build, and try loading data again (See load optimizations at https://github.com/facebook/mysql-5.6/wiki/data-loading)? DelayWrite() stall is expected to disappear with our new data loading feature.

� You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

gunnarku commented 7 years ago

Please re-open if you still the issue.