pingcap / tidb-lightning

This repository has been moved to https://github.com/pingcap/br
Apache License 2.0
143 stars 66 forks source link

tidb-lightning alters the values of timestamp columns #562

Open ngocson2vn opened 3 years ago

ngocson2vn commented 3 years ago

Bug Report

  1. What did you do? If possible, provide a recipe for reproducing the error. Using the tidb-lightning tool to restore a full backup data.

    • In the full backup data, there is a table that has timestamp columns like this:

      CREATE TABLE `users` (
      `id` bigint(20) NOT NULL,
      `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
      `updated_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      ...
      )
    • Set all servers (TiDB, PD, TiKV, Lightning, ...)' timezone to UTC

    • In the script deploy/scripts/start_lightning.sh, set timezone to Asia/Tokyo

      
      #!/bin/bash
      set -e
      ulimit -n 1000000
      cd "/home/ec2-user/deploy" || exit 1
      mkdir -p status

    export RUST_BACKTRACE=1

    export TZ=Asia/Tokyo

    echo -n 'sync ... ' stat=$(time sync) echo ok echo $stat

    nohup ./bin/tidb-lightning -config ./conf/tidb-lightning.toml &> log/tidb_lightning_stderr.log &

    echo $! > "status/tidb-lightning.pid"

  1. What did you expect to see? The tidb-lightning tool should respect the original data (in the full backup), import the original data as it is.

  2. What did you see instead? The tidb-lightning tool has altered the values of timestamp columns. For examples,

    Original data (from the full backup):

    $ head -2 ./xxxxx.users.000000001.sql
    INSERT INTO `users` VALUES
    (123456789123456789,'2019-09-03 12:31:02','2019-09-03 12:35:18',...)

    Imported data:

    > select id, created_at, updated_at from users where id = 123456789123456789;
    +--------------------+---------------------+---------------------+
    | id                 | created_at          | updated_at          |
    +--------------------+---------------------+---------------------+
    | 123456789123456789 | 2019-09-03 03:31:02 | 2019-09-03 03:35:18 |
    +--------------------+---------------------+---------------------+

    So the tidb-lightning tool has altered the values of columns created_at and updated_at. The original values have been subtracted by 9 hours.

  3. Versions of the cluster

    • TiDB-Lightning version (run tidb-lightning -V):

      Release Version: v4.0.9
      Git Commit Hash: 56bc32daad19b9dff10104c55300292de959fde3
      Git Branch: heads/refs/tags/v4.0.9
      UTC Build Time: 2020-12-19 04:48:01
      Go Version: go version go1.13 linux/amd64
    • TiKV-Importer version (run tikv-importer -V)

      Didn't use
    • TiKV version (run tikv-server -V):

      TiKV
      Release Version:   4.0.10
      Edition:           Community
      Git Commit Hash:   2ea4e608509150f8110b16d6e8af39284ca6c93a
      Git Commit Branch: heads/refs/tags/v4.0.10
      UTC Build Time:    2021-01-15 03:16:35
      Rust Version:      rustc 1.42.0-nightly (0de96d37f 2019-12-19)
      Enable Features:   jemalloc mem-profiling portable sse protobuf-codec
      Profile:           dist_release
    • TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

      +---------------------+
      | version()           |
      +---------------------+
      | 5.7.25-TiDB-v4.0.10 |
      +---------------------+
    • Other interesting information (system version, hardware config, etc):

      > show variables like '%time_zone%';
      +------------------+--------+
      | Variable_name    | Value  |
      +------------------+--------+
      | system_time_zone | UTC    |
      | time_zone        | SYSTEM |
      +------------------+--------+
      
      $ cat /etc/os-release
      NAME="Amazon Linux"
      VERSION="2"
      ID="amzn"
      ID_LIKE="centos rhel fedora"
      VERSION_ID="2"
      PRETTY_NAME="Amazon Linux 2"
      ANSI_COLOR="0;33"
      CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
      HOME_URL="https://amazonlinux.com/"
  4. Operation logs

    • Please upload tidb-lightning.log for TiDB-Lightning if possible
    • Please upload tikv-importer.log from TiKV-Importer if possible
    • Other interesting logs
  5. Configuration of the cluster and the task

    • tidb-lightning.toml for TiDB-Lightning if possible
      
      # lightning Configuration

    [lightning] file = "/home/tidb/deploy/log/tidb_lightning.log" index-concurrency = 2 io-concurrency = 5 level = "info" max-backups = 14 max-days = 28 max-size = 128 pprof-port = 8289 table-concurrency = 6

    [checkpoint] enable = true schema = "tidb_lightning_checkpoint" driver = "file"

    [tikv-importer] backend = "local" sorted-kv-dir = "/home/tidb/deploy/sorted-kv-dir"

    [mydumper] data-source-dir = "/home/tidb/deploy/mydumper/scheduled-backup-20210120-044816" no-schema = false read-block-size = 65536

    [tidb] build-stats-concurrency = 20 checksum-table-concurrency = 16 distsql-scan-concurrency = 100 host = "TIDB_HOST" index-serial-scan-concurrency = 20 log-level = "error" password = "xxxxx" port = 4000 status-port = 10080 user = "root" pd-addr = "PD_HOST:2379"

    [post-restore] analyze = true checksum = true

    [cron] log-progress = "5m" switch-mode = "5m"

    
    - `inventory.ini` if deployed by Ansible
  6. Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus for TiDB-Lightning if possible

3pointer commented 3 years ago

Two reasons cause this problem.

  1. Lightning and TiDB cluster have different TZ
  2. TiDB cluster's default variable time_zone is SYSTEM

Because we already set time_zone to SYSTEM in session, Lightning will use the local system TZ which is Asia/Tokyo.

To solve this problem quickly. you can set Lightning the same TZ with the TiDB cluster. Next, We can fix it by set Lightning session TZ to TiDB cluster's system_time_zone when TiDB cluster's time_zone is SYSTEM.

Thanks for the reports!