yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.99k stars 1.07k forks source link

Support running yugabyte-db on ZFS #4157

Open dhbaird opened 4 years ago

dhbaird commented 4 years ago

Jira Link: DB-1633 This ticket is more informational than an actual issue, because the underlying issue seemed to be a consequence of running on ZFS (zfsonlinux 0.8.3). And I should know better than that; But I wanted to document my experience in case it is helpful. I was able to easily workaround this issue by switching to another filesystem (ext4, though it should be noted that xfs is also recommended).

Yugabyte version: yugabyte-2.1.2.0

Without further ado: I was doing some very simple stress resiliency stress testing of trying to start yugabyte, and then kill it, and see how it reacts, of as follows:

Step 1: Install

wget https://downloads.yugabyte.com/yugabyte-2.1.2.0-linux.tar.gz
tar xzvf yugabyte-2.1.2.0-linux.tar.gz
cd yugabyte-2.1.2.0/
./bin/post_install.sh

Step 2: Try to break something (expectation: this shouldn't actually break anything):

./bin/yb-ctl --rf 3 create
./bin/yb-ctl status

kill -9 PID1 PID2  # the yb-tserver and yb-master pids for node 1

./bin/yb-ctl start_node 1 --master
./bin/yb-ctl start_node 1

I ran into trouble almost immediately:

$ ./bin/yb-ctl --rf 3 create
Creating cluster.
Waiting for cluster to be ready.
----------------------------------------------------------------------------------------------------
| Node Count: 3 | Replication Factor: 3                                                            |
----------------------------------------------------------------------------------------------------
| JDBC                : jdbc:postgresql://127.0.0.1:5433/postgres                                  |
| YSQL Shell          : bin/ysqlsh                                                                 |
| YCQL Shell          : bin/cqlsh                                                                  |
| YEDIS Shell         : bin/redis-cli                                                              |
| Web UI              : http://127.0.0.1:7000/                                                     |
| Cluster Data        : /home/person/yugabyte-data                                                 |
----------------------------------------------------------------------------------------------------

For more info, please use: yb-ctl status
$ ./bin/yb-ctl status
----------------------------------------------------------------------------------------------------
| Node Count: 3 | Replication Factor: 3                                                            |
----------------------------------------------------------------------------------------------------
| JDBC                : jdbc:postgresql://127.0.0.1:5433/postgres                                  |
| YSQL Shell          : bin/ysqlsh                                                                 |
| YCQL Shell          : bin/cqlsh                                                                  |
| YEDIS Shell         : bin/redis-cli                                                              |
| Web UI              : http://127.0.0.1:7000/                                                     |
| Cluster Data        : /home/person/yugabyte-data                                                 |
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
| Node 1: yb-tserver (pid 3017), yb-master (pid 3008)                                              |
----------------------------------------------------------------------------------------------------
| JDBC                : jdbc:postgresql://127.0.0.1:5433/postgres                                  |
| YSQL Shell          : bin/ysqlsh                                                                 |
| YCQL Shell          : bin/cqlsh                                                                  |
| YEDIS Shell         : bin/redis-cli                                                              |
| data-dir[0]         : /home/person/yugabyte-data/node-1/disk-1/yb-data                           |
| yb-tserver Logs     : /home/person/yugabyte-data/node-1/disk-1/yb-data/tserver/logs              |
| yb-master Logs      : /home/person/yugabyte-data/node-1/disk-1/yb-data/master/logs               |
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
| Node 2: yb-tserver (pid 3020), yb-master (pid 3011)                                              |
----------------------------------------------------------------------------------------------------
| JDBC                : jdbc:postgresql://127.0.0.2:5433/postgres                                  |
| YSQL Shell          : bin/ysqlsh -h 127.0.0.2                                                    |
| YCQL Shell          : bin/cqlsh 127.0.0.2                                                        |
| YEDIS Shell         : bin/redis-cli -h 127.0.0.2                                                 |
| data-dir[0]         : /home/person/yugabyte-data/node-2/disk-1/yb-data                           |
| yb-tserver Logs     : /home/person/yugabyte-data/node-2/disk-1/yb-data/tserver/logs              |
| yb-master Logs      : /home/person/yugabyte-data/node-2/disk-1/yb-data/master/logs               |
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
| Node 3: yb-tserver (pid 3029), yb-master (pid 3014)                                              |
----------------------------------------------------------------------------------------------------
| JDBC                : jdbc:postgresql://127.0.0.3:5433/postgres                                  |
| YSQL Shell          : bin/ysqlsh -h 127.0.0.3                                                    |
| YCQL Shell          : bin/cqlsh 127.0.0.3                                                        |
| YEDIS Shell         : bin/redis-cli -h 127.0.0.3                                                 |
| data-dir[0]         : /home/person/yugabyte-data/node-3/disk-1/yb-data                           |
| yb-tserver Logs     : /home/person/yugabyte-data/node-3/disk-1/yb-data/tserver/logs              |
| yb-master Logs      : /home/person/yugabyte-data/node-3/disk-1/yb-data/master/logs               |
----------------------------------------------------------------------------------------------------

(wait for a few minutes)

Then, kill node 1 processes and see what happens:

kill 3017 3008 Now I can't restart node.

$ ./bin/yb-ctl start_node 1
Starting node tserver-1.
Waiting for cluster to be ready.
Viewing file /home/person/yugabyte-data/node-1/disk-1/master.err:
*** Aborted at 1586188053 (unix time) try "date -d @1586188053" if you are using GNU date ***
PC: @     0x7fad8401a46d __GI___nanosleep
*** SIGTERM (@0x7d000000a92) received by PID 3008 (TID 0x7fad92dd2b40) from PID 2706; stack trace: ***
    @     0x7fad8401aba0 (unknown)
    @     0x7fad8401a46d __GI___nanosleep
    @     0x7fad8f46d95c  yb::server::TotalMemWatcher::MemoryMonitoringLoop()
    @           0x40861f  yb::master::MasterMain()
    @     0x7fad83689825 __libc_start_main
    @           0x407eb9 _start
    @                0x0 (unknown)
Traceback (most recent call last):
  File "./bin/yb-ctl", line 1969, in <module>
    control.run()
  File "./bin/yb-ctl", line 1946, in run
    self.args.func()
  File "./bin/yb-ctl", line 1804, in start_node_cmd_impl
    self.wait_for_cluster_or_raise()
  File "./bin/yb-ctl", line 1551, in wait_for_cluster_or_raise
    raise RuntimeError("Timed out waiting for a YugaByte DB cluster!")
RuntimeError: Timed out waiting for a YugaByte DB cluster!
Viewing file /tmp/tmpXeHd7g:
2020-04-06 12:38:23,524 INFO: Starting tserver-1 with:
/pool/HOME/person/Downloads/yugabyte-2.1.2.0/bin/yb-tserver --fs_data_dirs "/home/person/yugabyte-data/node-1/disk-1" --webserver_interface 127.0.0.1 --rpc_bind_addresses 127.0.0.1 --v 0 --version_file_json_path=/pool/HOME/person/Downloads/yugabyte-2.1.2.0 --webserver_doc_root "/pool/HOME/person/Downloads/yugabyte-2.1.2.0/www" --tserver_master_addrs=127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 --yb_num_shards_per_tserver=2 --redis_proxy_bind_address=127.0.0.1:6379 --cql_proxy_bind_address=127.0.0.1:9042 --local_ip_for_outbound_sockets=127.0.0.1 --use_cassandra_authentication=false --ysql_num_shards_per_tserver=2 --enable_ysql=true --pgsql_proxy_bind_address=127.0.0.1:5433 >"/home/person/yugabyte-data/node-1/disk-1/tserver.out" 2>"/home/person/yugabyte-data/node-1/disk-1/tserver.err" &
2020-04-06 12:38:23,528 INFO: Waiting for master and tserver processes to come up.
2020-04-06 12:38:34,790 INFO: PIDs found: {'tserver': [10775, 3020, 3029], 'master': [None, 3011, 3014]}
2020-04-06 12:38:34,790 ERROR: Failed waiting for master and tserver processes to come up.
^^^ Encountered errors ^^^

~/yugabyte-data/node-1/disk-1/yb-data/master/logs/yb-master.myhost.personlog.INFO.20200406-190809.29819

Log file created at: 2020/04/06 19:08:09
Running on machine: myhost
Application fingerprint: version 2.1.2.0 build 10 revision d16f8e36d68311b1d9130ea28cec9e338b9f870e build_type RELEASE built at 10 Mar 2020 06:23:36 UTC
Running duration (h:mm:ss): 0:00:00
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0406 19:08:09.102195 29819 master_main.cc:93] NumCPUs determined to be: 32
I0406 19:08:09.103031 29819 mem_tracker.cc:249] MemTracker: hard memory limit is 25.186874 GB
I0406 19:08:09.103061 29819 mem_tracker.cc:251] MemTracker: soft memory limit is 21.408842 GB
I0406 19:08:09.105435 29819 rpc_server.cc:86] yb::server::RpcServer created at 0xe420b0
I0406 19:08:09.105480 29819 master.cc:143] yb::master::Master created at 0x7ffe1bc8e3e0
I0406 19:08:09.105509 29819 master.cc:144] yb::master::TSManager created at 0xe54d90
I0406 19:08:09.105525 29819 master.cc:145] yb::master::CatalogManager created at 0x11b8000
I0406 19:08:09.105605 29819 master_main.cc:114] Initializing master server...
I0406 19:08:09.108346 29819 fs_manager.cc:249] Opened local filesystem: /pool/HOME/person/yugabyte-data/node-1/disk-1
uuid: "88ccd4fe51a24d28a698cb264bdf37b9"
format_stamp: "Formatted at 2020-04-06 14:56:42 on myhost"
I0406 19:08:09.108667 29819 server_base.cc:231] Auto setting FLAGS_num_reactor_threads to 16
I0406 19:08:09.112211 29819 master_main.cc:117] Starting Master server...
I0406 19:08:09.112241 29847 async_initializer.cc:73] Starting to init ybclient
I0406 19:08:09.112704 29819 service_pool.cc:148] yb.master.MasterBackupService: yb::rpc::ServicePoolImpl created at 0xe46900
I0406 19:08:09.113672 29819 service_pool.cc:148] yb.master.MasterService: yb::rpc::ServicePoolImpl created at 0xe46d80
I0406 19:08:09.113984 29819 service_pool.cc:148] yb.tserver.TabletServerService: yb::rpc::ServicePoolImpl created at 0xe478c0
I0406 19:08:09.114169 29819 service_pool.cc:148] yb.consensus.ConsensusService: yb::rpc::ServicePoolImpl created at 0xe47d40
I0406 19:08:09.114426 29819 service_pool.cc:148] yb.tserver.RemoteBootstrapService: yb::rpc::ServicePoolImpl created at 0x14aa000
I0406 19:08:09.114567 29819 webserver.cc:148] Starting webserver on 127.0.0.1:7000
I0406 19:08:09.114591 29819 webserver.cc:153] Document root: /pool/HOME/person/Downloads/yugabyte-2.1.2.0/www
I0406 19:08:09.139885 29819 webserver.cc:240] Webserver started. Bound to: http://127.0.0.1:7000/
I0406 19:08:09.140082 29819 service_pool.cc:148] yb.server.GenericService: yb::rpc::ServicePoolImpl created at 0x14aa240
I0406 19:08:09.140260 29819 rpc_server.cc:169] RPC server started. Bound to: 127.0.0.1:7100
I0406 19:08:09.140323 29819 server_base.cc:487] Using private ip address 127.0.0.1
I0406 19:08:09.140604 29853 sys_catalog.cc:194] Trying to load previous SysCatalogTable data from disk
I0406 19:08:09.143571 29847 async_initializer.cc:77] Successfully built ybclient
I0406 19:08:09.155972 29853 consensus_meta.cc:275] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Updating active role from UNKNOWN_ROLE to FOLLOWER. Consensus state: current_term: 1 leader_uuid: "" config { opid_index: -1 peers { permanent_uuid: "88ccd4fe51a24d28a698cb264bdf37b9" member_type: VOTER last_known_private_addr { host: "127.0.0.1" port: 7100 } cloud_info { placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } } peers { permanent_uuid: "a54016ad06304da79e0f2b78d77829bc" member_type: VOTER last_known_private_addr { host: "127.0.0.2" port: 7100 } cloud_info { placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } } peers { permanent_uuid: "34038ec8f2e6444cbde38c7e6f031995" member_type: VOTER last_known_private_addr { host: "127.0.0.3" port: 7100 } cloud_info { placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } } }, has_pending_config = 0
I0406 19:08:09.156085 29853 sys_catalog.cc:239] Configuring consensus for distributed operation...
I0406 19:08:09.156131 29853 server_base.cc:487] Using private ip address 127.0.0.1
I0406 19:08:09.156396 29853 tablet_peer.cc:1156] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [state=BOOTSTRAPPING]: Changed state from NOT_STARTED to BOOTSTRAPPING
I0406 19:08:09.156735 29853 consensus_meta.cc:275] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Updating active role from UNKNOWN_ROLE to FOLLOWER. Consensus state: current_term: 1 leader_uuid: "" config { opid_index: -1 peers { permanent_uuid: "88ccd4fe51a24d28a698cb264bdf37b9" member_type: VOTER last_known_private_addr { host: "127.0.0.1" port: 7100 } cloud_info { placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } } peers { permanent_uuid: "a54016ad06304da79e0f2b78d77829bc" member_type: VOTER last_known_private_addr { host: "127.0.0.2" port: 7100 } cloud_info { placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } } peers { permanent_uuid: "34038ec8f2e6444cbde38c7e6f031995" member_type: VOTER last_known_private_addr { host: "127.0.0.3" port: 7100 } cloud_info { placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" } } }, has_pending_config = 0
I0406 19:08:09.156788 29853 tablet_bootstrap_if.cc:75] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Bootstrap starting.
I0406 19:08:09.157094 29853 tablet.cc:376] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9:  Schema version for  sys.catalog is 0
I0406 19:08:09.157390 29853 transaction_participant.cc:119] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Create
I0406 19:08:09.157519 29853 docdb_rocksdb_util.cc:272] Auto setting FLAGS_rocksdb_max_background_flushes to 4
I0406 19:08:09.157555 29853 docdb_rocksdb_util.cc:295] Auto setting FLAGS_rocksdb_max_background_compactions to 3
I0406 19:08:09.157569 29853 docdb_rocksdb_util.cc:306] Auto setting FLAGS_rocksdb_base_background_compactions to 3
I0406 19:08:09.157590 29853 docdb_rocksdb_util.cc:329] Auto setting FLAGS_priority_thread_pool_size to 14
I0406 19:08:09.157753 29853 tablet.cc:479] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Creating RocksDB database in dir /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000
I0406 19:08:09.158574 29853 tablet.cc:605] Opening RocksDB at: /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000
I0406 19:08:09.160840 29853 version_set.cc:2818] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: Recovered from manifest file:/pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000/MANIFEST-000024 succeeded,manifest_file_number is 24, next_file_number is 26, last_sequence is 1125899908347196, log_number is 0,prev_log_number is 0,max_column_family is 0, flushed_values is 0x00000000016513b0 -> { op_id: 1.4 hybrid_time: { physical: 1586185003220248 } history_cutoff: <invalid> hybrid_time_filter: <invalid> }
I0406 19:08:09.160887 29853 version_set.cc:2826] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: Column family [default] (ID 0), log number is 23
I0406 19:08:09.161088 29853 event_logger.cc:67] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: EVENT_LOG_v1 {"time_micros": 1586200089161060, "job": 1, "event": "recovery_started", "log_files": [25]}
I0406 19:08:09.161142 29853 db_impl.cc:1501] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: Recovering log #25 mode 0 skip-recovery 0
I0406 19:08:09.161207 29853 version_set.cc:2259] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: Creating manifest 27
I0406 19:08:09.161491 29853 version_set.cc:3325] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: Writing version edit: comparator: "leveldb.BytewiseComparator"

I0406 19:08:09.161828 29853 version_set.cc:3325] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: Writing version edit: log_number: 23
new_files {
  level: 0
  number: 10
  total_file_size: 23914435
  base_file_size: 110376
  smallest {
    key: "H\200\000\000\001S000000010000300080000000000000af\000\000!J\200#\200\001\263\303\311x\026\312\200J\001\250X\002\000\000\000\004"
    seqno: 1125899906842625
    user_values {
      tag: 1
      data: "\200\001\263\303\311\324;C\200J"
    }
    user_values {
      tag: 10
      data: "H\200\000\000\001"
    }
    user_values {
      tag: 11
      data: "$"
    }
    user_values {
      tag: 12
      data: "$"
    }
    user_values {
      tag: 13
      data: "$"
    }
    user_values {
      tag: 14
      data: "$"
    }
  }
  largest {
    key: "y\016\340@\000\000\000\000\200\0000\000\000\001\000\000\000O\000\0000\252!K\215#\200\001\263\303\306\326\200\307\200>+\0018\365\026\000\000\000\004"
    seqno: 1125899908347196
    user_values {
      tag: 1
      data: "\200\001\263\303\306\326a\315\200?\253"
    }
    user_values {
      tag: 10
      data: "S\377\350\240Yf\327N\336\264\244\352,\016Jj\332\000\000"
    }
    user_values {
      tag: 11
      data: "Sysql-catalog-configuration\000\000"
    }
    user_values {
      tag: 12
      data: "Syes_or_no_check\000\000"
    }
    user_values {
      tag: 13
      data: "SS\377\350\240Yf\327N\336\264\244\352,\016Jj\332\000\001\000\001!\000\000"
    }
    user_values {
      tag: 14
      data: "SS\323\006E\217\024\177A\306\256\212\244\201\002\261\377\325\000\001\000\001!\000\000"
    }
  }
}
flushed_frontier {
  [type.googleapis.com/yb.docdb.ConsensusFrontierPB] {
    op_id {
      term: 1
      index: 4
    }
    hybrid_time: 6497013773190135808
    history_cutoff: 18446744073709551614
  }
}

I0406 19:08:09.162163 29853 event_logger.cc:67] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: EVENT_LOG_v1 {"time_micros": 1586200089162159, "job": 1, "event": "recovery_finished"}
I0406 19:08:09.162447 29853 db_impl.cc:1154] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: [JOB 2] Delete /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000//MANIFEST-000024 type=4 #24 -- OK
I0406 19:08:09.162544 29853 db_impl.cc:1154] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: [JOB 2] Delete /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000//000025.log type=0 #25 -- OK
I0406 19:08:09.164104 29853 tablet.cc:620] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Opening intents DB at: /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000.intents
I0406 19:08:09.164360 29853 version_set.cc:2818] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: Recovered from manifest file:/pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000.intents/MANIFEST-000014 succeeded,manifest_file_number is 14, next_file_number is 16, last_sequence is 1125899906842624, log_number is 0,prev_log_number is 0,max_column_family is 0, flushed_values is 0x0000000001603b00 -> { op_id: 1.4 hybrid_time: { physical: 1586185003220248 } history_cutoff: <invalid> hybrid_time_filter: <invalid> }
I0406 19:08:09.164394 29853 version_set.cc:2826] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: Column family [default] (ID 0), log number is 13
I0406 19:08:09.164461 29853 event_logger.cc:67] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: EVENT_LOG_v1 {"time_micros": 1586200089164457, "job": 1, "event": "recovery_started", "log_files": [15]}
I0406 19:08:09.164495 29853 db_impl.cc:1501] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: Recovering log #15 mode 0 skip-recovery 0
I0406 19:08:09.164525 29853 version_set.cc:2259] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: Creating manifest 17
I0406 19:08:09.164651 29853 version_set.cc:3325] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: Writing version edit: comparator: "leveldb.BytewiseComparator"

I0406 19:08:09.164727 29853 version_set.cc:3325] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: Writing version edit: log_number: 13
flushed_frontier {
  [type.googleapis.com/yb.docdb.ConsensusFrontierPB] {
    op_id {
      term: 1
      index: 4
    }
    hybrid_time: 6497013773190135808
    history_cutoff: 18446744073709551614
  }
}

I0406 19:08:09.164943 29853 event_logger.cc:67] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: EVENT_LOG_v1 {"time_micros": 1586200089164940, "job": 1, "event": "recovery_finished"}
I0406 19:08:09.165122 29853 db_impl.cc:1154] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: [JOB 2] Delete /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000.intents//MANIFEST-000014 type=4 #14 -- OK
I0406 19:08:09.165203 29853 db_impl.cc:1154] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: [JOB 2] Delete /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000.intents//000015.log type=0 #15 -- OK
I0406 19:08:09.166501 29853 tablet.cc:664] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Successfully opened a RocksDB database at /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000, obj: 0xef3400
I0406 19:08:09.166543 29866 transaction_participant.cc:945] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: LoadTransactions start
I0406 19:08:09.166561 29853 tablet_bootstrap.cc:431] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Time spent opening tablet: real 0.009s  user 0.008s sys 0.001s
I0406 19:08:09.166612 29866 transaction_participant.cc:984] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: LoadTransactions done: loaded 0 transactions
I0406 19:08:09.166752 29853 tablet_bootstrap.cc:497] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Will attempt to recover log segment /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/wals/table-sys.catalog.uuid/tablet-00000000000000000000000000000000/wal-000000002
I0406 19:08:09.166787 29853 tablet_bootstrap.cc:497] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Will attempt to recover log segment /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/wals/table-sys.catalog.uuid/tablet-00000000000000000000000000000000/wal-000000001
I0406 19:08:09.166844 29853 tablet_bootstrap.cc:833] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Max persistent index in RocksDB's SSTables before bootstrap: term: 1 index: 4/term: 1 index: 4
I0406 19:08:09.166975 29853 log.cc:808] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Setting log wal retention time to 0 seconds
W0406 19:08:09.167327 29853 log.cc:1179] Error closing log: Illegal state (yb/consensus/log.cc:1012): Bad state for Close() 0
I0406 19:08:09.167734 29853 transaction_participant.cc:141] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9: Shutdown
I0406 19:08:09.167825 29853 db_impl.cc:641] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: Shutting down RocksDB at: /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000.intents
I0406 19:08:09.168032 29853 db_impl.cc:751] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [I]: Shutdown done
I0406 19:08:09.168081 29853 db_impl.cc:641] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: Shutting down RocksDB at: /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/data/rocksdb/table-sys.catalog.uuid/tablet-00000000000000000000000000000000
I0406 19:08:09.168210 29853 db_impl.cc:751] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [R]: Shutdown done
E0406 19:08:09.168334 29853 master.cc:241] Master@127.0.0.1:7100: Unable to init master catalog manager: IO error (yb/util/env_util.cc:156): Unable to initialize catalog manager: Failed to initialize sys tables async: Failed log replay. Reason: Failed to open new log: Unable to open readable log segment: /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/wals/table-sys.catalog.uuid/tablet-00000000000000000000000000000000/wal-000000002: Unable to initialize segment: EOF trying to read 12 bytes at offset 0
E0406 19:08:09.168632 29819 main_util.cc:22] IO error (yb/util/env_util.cc:156): Unable to initialize catalog manager: Failed to initialize sys tables async: Failed log replay. Reason: Failed to open new log: Unable to open readable log segment: /pool/HOME/person/yugabyte-data/node-1/disk-1/yb-data/master/wals/table-sys.catalog.uuid/tablet-00000000000000000000000000000000/wal-000000002: Unable to initialize segment: EOF trying to read 12 bytes at offset 0
I0406 19:08:09.168830 29819 master.cc:284] Master@127.0.0.1:7100 shutting down...
I0406 19:08:09.191009 29819 tablet_peer.cc:380] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [state=BOOTSTRAPPING]: Initiating TabletPeer shutdown
I0406 19:08:09.191084 29819 tablet_peer.cc:397] T 00000000000000000000000000000000 P 88ccd4fe51a24d28a698cb264bdf37b9 [state=QUIESCING]: Started shutdown from state: BOOTSTRAPPING
I0406 19:08:09.191339 29819 master.cc:292] Master@127.0.0.1:7100 shutdown complete.
bmatican commented 4 years ago

Would be great to at least put a note in our docs about this, cc @ddorian @stevebang

Perhaps in some of the new core db best practices section?

ddorian commented 4 years ago

@bmatican Pull request for docs: https://github.com/yugabyte/yugabyte-db/pull/4159

ddorian commented 4 years ago

Docs are updated. Leaving open for feature request.