pingcap / dm

Data Migration Platform
Apache License 2.0

database name causes mydumper to fail #796

Closed elvizlai closed 4 years ago

elvizlai commented 4 years ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a recipe for reproducing the error.

For historical reasons, some of our database names use `-` as a separator.

Here is my task config:

---
name: test1
task-mode: all

target-database:
  host: "PRIVATE_IP"
  port: 4000
  user: "root"
  password: "HIDDEN"

routes:
  route-rule-1:
    schema-pattern: "abc-workorder"
    target-schema: "dev_abc_workorder"

black-white-list:
  bw-rule-1:
    do-dbs: ["abc-workorder"]

mysql-instances:
  - source-id: "mysql-replica-01"
    black-white-list: "bw-rule-1"
    route-rules: ["route-rule-1"]
    mydumper-thread: 4
    loader-thread: 16
    syncer-thread: 16
  2. What did you expect to see?

     start-task conf/test1.yaml should be in stage Running.

  3. What did you see instead?

    
    query-status test1

{
    "result": true,
    "msg": "",
    "workers": [
        {
            "result": true,
            "worker": "127.0.0.1:8262",
            "msg": "",
            "subTaskStatus": [
                {
                    "name": "test1",
                    "stage": "Paused",
                    "unit": "Dump",
                    "result": {
                        "isCanceled": false,
                        "errors": [
                            {
                                "Type": "UnknownError",
                                "msg": "[code=32001:class=dump-unit:scope=internal:level=high] mydumper runs with error: exit status 1. \n\n",
                                "error": null
                            }
                        ],
                        "detail": null
                    },
                    "unresolvedDDLLockID": "",
                    "dump": {

                    }
                }
            ],
            "relayStatus": {
                "masterBinlog": "(mysql-bin.000010, 641453056)",
                "masterBinlogGtid": "08e024dd-22db-11ea-837a-00163e04dda5:1-3918708",
                "relaySubDir": "0652de9c-ddd6-11e9-8f03-00163e00241b.000001",
                "relayBinlog": "(mysql-bin.000010, 641453056)",
                "relayBinlogGtid": "",
                "relayCatchUpMaster": true,
                "stage": "Running",
                "result": null
            },
            "sourceID": "mysql-replica-01"
        }
    ]
}


Running the raw mydumper command works: `./mydumper -B abc-workorder`

After turning the log level to debug, the key point seems to be the `--tables-list` argument:

dm-worker-2020-07-10T10-56-27.519.log:[2020/07/10 10:56:11.388 +08:00] [INFO] [mydumper.go:276] ["Tables needed to dump are not given, now we will start to generate table list that mydumper needs to dump through black-white list from given fromDB"] [task=test1] [unit=dump]

dm-worker-2020-07-10T10-56-27.519.log:[2020/07/10 10:56:11.407 +08:00] [INFO] [mydumper.go:284] ["create mydumper"] [task=test1] [unit=dump] [argument="[--host,172.19.67.154,--port,3306,--user,root,--outputdir,./dumped_data.test1,--logfile,/dev/stderr,--verbose,3,--threads,4,--chunk-filesize,64,--skip-tz-utc,--tables-list,abc-workorder.wo_workorder_schedule_record,abc-workorder.wo_workorder_service,abc-workorder.wo_workorder_evaluation,abc-workorder.wo_workorder_authority,abc-workorder.wo_workorder_in,abc-workorder.wo_workorder_in_temp,abc-workorder.wo_workorder_log,abc-workorder.wo_workorder_out,abc-workorder.wo_workorder_out1,abc-workorder.wo_workorder_schedule_log,abc-workorder.et_product_order,abc-workorder.wo_workorder_schedule_result,abc-workorder.wo_workorder_schedule,abc-workorder.wo_workorder_schedule_service_record,abc-workorder.wo_workorder_config,abc-workorder.wo_workorder_history,abc-workorder.wo_workorder_schedule_service,abc-workorder.wo_care_receiver_sign]"]

dm-worker-2020-07-10T10-56-27.519.log:[2020/07/10 10:56:14.595 +08:00] [ERROR] [mydumper.go:170] ["Error: DB: abc-workorder - Could not execute query: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-workorder LIKE 'wo_workorder_schedule_record'' at line 1"] [task=test1] [unit=dump]
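The syntax error above happens because the schema name is interpolated unquoted into SQL such as `SHOW TABLES FROM abc-workorder LIKE '...'`, where the hyphen is parsed as a minus operator. Here is a minimal Go sketch (not DM's or mydumper's actual code) of backtick-quoting a MySQL identifier, which is what would avoid the error:

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdentifier wraps a MySQL identifier in backticks, doubling any
// embedded backtick, so a name containing '-' parses as an identifier
// rather than as a subtraction expression.
func quoteIdentifier(name string) string {
	return "`" + strings.ReplaceAll(name, "`", "``") + "`"
}

func main() {
	// Unquoted: SHOW TABLES FROM abc-workorder ...   -> syntax error near '-workorder'
	// Quoted:   SHOW TABLES FROM `abc-workorder` ... -> valid
	fmt.Println("SHOW TABLES FROM " + quoteIdentifier("abc-workorder") + " LIKE 'wo_workorder_schedule_record'")
}
```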


Then I tried using `` ` `` to wrap my db-name. The task status is `Running` now, but no data is synced and there is no SQL file in the `dumped_data.*` dir.

name: test1
task-mode: all

target-database:
  host: "PRIVATE_IP"
  port: 4000
  user: "root"
  password: "HIDDEN"

routes:
  route-rule-1:
    schema-pattern: "`abc-workorder`"
    target-schema: "dev_abc_workorder"

black-white-list:
  bw-rule-1:
    do-dbs: ["`abc-workorder`"]

mysql-instances:

Raw log

dm-worker.log:[2020/07/10 11:09:39.388 +08:00] [INFO] [mydumper.go:276] ["Tables needed to dump are not given, now we will start to generate table list that mydumper needs to dump through black-white list from given fromDB"] [task=test2] [unit=dump]
dm-worker.log:[2020/07/10 11:09:39.397 +08:00] [INFO] [mydumper.go:284] ["create mydumper"] [task=test2] [unit=dump] [argument="[--host,172.19.67.154,--port,3306,--user,root,--outputdir,./dumped_data.test2,--logfile,/dev/stderr,--verbose,3,--threads,4,--chunk-filesize,64,--skip-tz-utc,--tables-list,]"]
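A plausible explanation for the empty `--tables-list` above, sketched in Go (hypothetical, not DM's actual matcher): the black-white list compares the configured pattern literally against the schema names reported by the server, and the backtick-wrapped pattern no longer matches anything, so no tables are selected.

```go
package main

import "fmt"

// schemaMatches is a simplified exact-string match standing in for
// DM's pattern matcher. The point: once backticks become part of the
// configured pattern, it no longer equals the real schema name.
func schemaMatches(pattern, schema string) bool {
	return pattern == schema
}

func main() {
	fmt.Println(schemaMatches("`abc-workorder`", "abc-workorder")) // false: empty table list
	fmt.Println(schemaMatches("abc-workorder", "abc-workorder"))   // true
}
```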
  4. Versions of the cluster

    • DM version (run dmctl -V or dm-worker -V or dm-master -V):

      Release Version: v1.0.6
      Git Commit Hash: eaf2683c05ab44143bfb286bfbbc3ba157c555cc
      Git Branch: release-1.0
      UTC Build Time: 2020-06-17 10:22:01
      Go Version: go version go1.13 linux/amd64
    • Upstream MySQL/MariaDB server version:

      5.7.17-log
    • Downstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

      4.0.2
    • How did you deploy DM: DM-Ansible or manually?

      (leave DM-Ansible or manually here)
    • Other interesting information (system version, hardware config, etc):

  5. Current status of the DM cluster (execute query-status in dmctl)

  6. Operation logs

    • Please upload dm-worker.log for every DM-worker instance if possible
    • Please upload dm-master.log if possible
    • Other interesting logs
    • Output of dmctl's commands with problems
  7. Configuration of the cluster and the task

    • dm-worker.toml for every DM-worker instance if possible
    • dm-master.toml for DM-master if possible
    • task config, like task.yaml if possible
    • inventory.ini if deployed by DM-Ansible
  8. Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus for DM if possible

elvizlai commented 4 years ago

I tried running `./mydumper --tables-list abc-workorder.wo_workorder_schedule`. It turns out:

** (mydumper:27463): CRITICAL **: 14:05:35.041: Error: DB: abc-workorder - Could not execute query: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-workorder LIKE 'wo_workorder_schedule'' at line 1

I compiled the latest version of mydumper. It still does not work.

But `./mydumper -B abc-workorder --tables-list wo_workorder_schedule` works.

Any suggestions? @lance6716

lance6716 commented 4 years ago

Problem reproduced, still looking at the code @elvizlai

elvizlai commented 4 years ago

Similar issue.

routes:
  route-rule-1:
    schema-pattern: "abc_mall_serviceTrade"
    target-schema: "dev_abc_mall_trade"

black-white-list:
  bw-rule-1:
    do-dbs: ["abc_mall_serviceTrade"]

The checked database name becomes all lowercase, abc_mall_servicetrade:

check-task conf/test2.yaml

{
    "result": false,
    "msg": "[code=10001:class=dm-master:scope=upstream:level=high] fail to initial checker: database driver error: Error 1049: Unknown database 'abc_mall_servicetrade'\ngithub.com/pingcap/dm/pkg/terror.(*Error).Delegate\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/terror.go:271\ngithub.com/pingcap/dm/pkg/terror.DBErrorAdaptArgs\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/adapter.go:39\ngithub.com/pingcap/dm/pkg/terror.DBErrorAdapt\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/adapter.go:46\ngithub.com/pingcap/dm/pkg/utils.FetchAllDoTables\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/utils/common.go:98\ngithub.com/pingcap/dm/pkg/utils.FetchTargetDoTables\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/utils/common.go:125\ngithub.com/pingcap/dm/checker.(*Checker).Init\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/checker/checker.go:183\ngithub.com/pingcap/dm/checker.CheckSyncConfig\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/checker/cmd.go:61\ngithub.com/pingcap/dm/dm/master.(*Server).generateSubTask\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/master/server.go:1876\ngithub.com/pingcap/dm/dm/master.(*Server).CheckTask\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/master/server.go:1835\ngithub.com/pingcap/dm/dm/pb._Master_CheckTask_Handler\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/dm/pb/dmmaster.pb.go:2643\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.25.1/server.go:1007\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.25.1/server.go:1287\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.25.1/server.go:722\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"
}
lance6716 commented 4 years ago

Hi, are you using DM in a test environment? I haven't found a convenient way to work around mydumper's --tables-list. In DM 2.0 we use dumpling as a substitute for mydumper, and dumpling supports this case, so you may try DM 2.0.

Or you can try symlinking a dumpling binary in place of mydumper to see if it works.

elvizlai commented 4 years ago

@lance6716 Any doc about DM 2.0? The dm-worker/dm-master config files seem incompatible.

Symlinking dumpling to mydumper does not work:

"msg": "[code=32001:class=dump-unit:scope=internal:level=high] mydumper runs with error: exit status 2. unknown flag: --outputdir\nunknown flag: --outputdir\nDumpling is a CLI tool that helps you dump MySQL/TiDB data\n\nUsage:\n  dumpling [flags]\n\nFlags:\n      --ca string                   The path name to the certificate authority file for TLS connection\n      --case-sensitive              whether the filter should be case-sensitive\n      --cert string                 The path name to the client certificate file for TLS connection\n      --consistency string          Consistency level during dumping: {auto|none|flush|lock|snapshot} (default \"auto\")\n      --csv-null-value string       The null value used when export to csv (default \"\\\\N\")\n  -B, --database strings            Databases to dump\n      --dump-empty-database         whether to dump empty database (default true)\n      --escape-backslash            use backslash to escape special characters (default true)\n  -F, --filesize string             The approximate size of output file\n      --filetype string             The type of export file (sql/csv) (default \"sql\")\n  -f, --filter stringArray          filter to select which tables to dump (default [*.*])\n  -h, --host string                 The host to connect to (default \"127.0.0.1\")\n      --key string                  The path name to the client private key file for TLS connection\n  -L, --logfile path                Log file path, leave empty to write to console\n      --logfmt format               Log format: {text|json} (default \"text\")\n      --loglevel string             Log level: {debug|info|warn|error|dpanic|panic|fatal} (default \"info\")\n  -d, --no-data                     Do not dump table data\n      --no-header                   whether not to dump CSV table header\n  -m, --no-schemas                  Do not dump table schemas with the data\n  -W, --no-views                    Do not dump views (default true)\n  -o, 
--output string               Output directory (default \"./export-2020-07-10T14:55:06+08:00\")\n  -p, --password string             User password\n  -P, --port int                    TCP/IP port to connect to (default 4000)\n  -r, --rows uint                   Split table into chunks of this many rows, default unlimited\n      --snapshot string             Snapshot position (uint64 from pd timestamp for TiDB). Valid only when consistency=snapshot\n  -S, --sql string                  Dump data with given sql. This argument doesn't support concurrent dump\n  -s, --statement-size uint         Attempted size of INSERT statement in bytes\n      --status-addr string          dumpling API server and pprof addr (default \":8281\")\n  -T, --tables-list strings         Comma delimited table list to dump; must be qualified table names\n  -t, --threads int                 Number of goroutines to use, default 4 (default 4)\n      --tidb-mem-quota-query uint   The maximum memory limit for a single SQL statement, in bytes. Default: 32GB (default 34359738368)\n  -u, --user string                 Username with privileges to run the dump (default \"root\")\n  -V, --version                     Print Dumpling version\n      --where string                Dump only selected records\n",
lance6716 commented 4 years ago

> @lance6716 Any doc about DM 2.0? The dm-worker/dm-master config files seem incompatible.
>
> Symlinking dumpling to mydumper does not work.


Docs are provided at https://docs.pingcap.com/ under the DM 2.0 (dev) doc; you can compare it with the DM 1.0 doc.

We plan to release 2.0 in the next few months. If you find anything to improve in the doc or elsewhere, you are welcome to report it on GitHub or AskTUG.

elvizlai commented 4 years ago

Sorry to interrupt, @lance6716

The DM dev doc is really confusing. Where can I place or configure a data source? [solved] Using the command `./bin/dmctl --master-addr=192.168.0.4:8261 operate-source create conf/source1.toml`

Also, the beta release contains mydumper but not dumpling; should I download it manually?

lance6716 commented 4 years ago

> Also, the beta release contains mydumper but not dumpling; should I download it manually?

No need to download dumpling; it's developed by us and embedded into DM.

> The DM dev doc is really confusing. Where can I place or configure a data source? [solved] Using the command `./bin/dmctl --master-addr=192.168.0.4:8261 operate-source create conf/source1.toml`

Sorry for the unclear doc. Do you have any ideas for improving it? You could send a PR here or raise them in this issue.

elvizlai commented 4 years ago

A database named like abc-xyz is solved by DM 2.0.

But there is an error:

packet for query is too large. Try adjusting the 'max_allowed_packet' variable on the server

Should I set this on the TiDB side?

set @@global.max_allowed_packet=134217728

And the mixed-case database name is still not working.

routes:
  route-rule-1:
    schema-pattern: "abc_mall_serviceTrade"
    target-schema: "dev_abc_mall_trade"

black-white-list:
  bw-rule-1:
    do-dbs: ["abc_mall_serviceTrade"]
"msg": "[code=10001:class=dm-master:scope=upstream:level=high] fail to initial checker: database driver error: Error 1049: Unknown database 'abc_mall_servicetrade'"
lance6716 commented 4 years ago

> A database named like abc-xyz is solved by DM 2.0.
>
> But the mixed-case database name is still not working.

I don't understand: do you have an upper-case database name, or can task.yaml not use upper-case? You may check https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_lower_case_table_names

elvizlai commented 4 years ago

> I don't understand: do you have an upper-case database name, or can task.yaml not use upper-case? You may check https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_lower_case_table_names

Our database is named abc_mall_serviceTrade.

But when using DM, the log shows it tries to fetch database abc_mall_servicetrade, which does not exist.
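The behavior described above is consistent with a checker that lowercases the configured schema name before looking it up on the server. A hypothetical Go sketch of that normalization (not DM's actual implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeSchema mimics a case-insensitive checker that lowercases the
// configured schema name before querying the server. On a case-sensitive
// server (lower_case_table_names=0) the lowercased name does not exist,
// which would produce the "Unknown database" error seen above.
func normalizeSchema(name string) string {
	return strings.ToLower(name)
}

func main() {
	fmt.Println(normalizeSchema("abc_mall_serviceTrade")) // abc_mall_servicetrade
}
```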

lance6716 commented 4 years ago

> A database named like abc-xyz is solved by DM 2.0.
>
> But there is an error:
>
> packet for query is too large. Try adjusting the 'max_allowed_packet' variable on the server
>
> Should I set this on the TiDB side?
>
> set @@global.max_allowed_packet=134217728

Please provide more details, such as the table structure and log file.

lance6716 commented 4 years ago

> Our database is named abc_mall_serviceTrade. But when using DM, the log shows it tries to fetch database abc_mall_servicetrade, which does not exist.

Answered here: https://asktug.com/t/topic/35497. You could join AskTUG and get more help from the community.

elvizlai commented 4 years ago

> Please provide more details, such as the table structure and log file.

(screenshot: Jietu20200710-160654)

lance6716 commented 4 years ago

> (screenshot: Jietu20200710-160654)

https://docs.pingcap.com/zh/tidb/stable/load-misuse-handling#%E6%8A%A5%E9%94%99%EF%BC%9Atry-adjusting-the-max_allowed_packet-variable

elvizlai commented 4 years ago

I followed the instructions and ran set @@global.max_allowed_packet=134217728 on our TiDB server side.

But it is still not solved.

> Depending on the actual situation, add something like max-allowed-packet=128M to the db configuration in the Loader configuration file or the DM task configuration file, then restart the process or task.

But DM 2.0 does not seem to provide such a tuning parameter (or it is not in the doc); maybe I'll go to AskTUG to discuss.

lance6716 commented 4 years ago

https://github.com/pingcap/dm/blob/95d37b4f8cf36c5da84714f9c25d7502028bc835/tests/load_interrupt/conf/dm-task.yaml#L31

The --statement-size option for DM 2.0 is here; try setting it to about 1000000 according to the doc. BTW, you can grep the log to see your current value.

We're going to improve the doc after finishing the pending development work; please help us refine the doc if you have some spare time 😄
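To see why a statement-size budget helps, here is an illustrative Go sketch (assumptions only, not dumpling's real code): rows are packed into one INSERT until adding the next tuple would exceed the byte budget, so a smaller budget yields smaller statements and smaller packets, avoiding max_allowed_packet errors.

```go
package main

import "fmt"

// buildInserts groups row tuples into INSERT statements whose byte
// length stays under maxSize, mimicking a --statement-size budget.
// Hypothetical sketch; table and rows are illustrative.
func buildInserts(table string, rows []string, maxSize int) []string {
	var stmts []string
	head := "INSERT INTO " + table + " VALUES "
	cur := ""
	for _, r := range rows {
		next := r
		if cur != "" {
			next = cur + "," + r
		}
		if cur != "" && len(head)+len(next) > maxSize {
			// Budget exceeded: flush the current statement, start a new one.
			stmts = append(stmts, head+cur+";")
			cur = r
		} else {
			cur = next
		}
	}
	if cur != "" {
		stmts = append(stmts, head+cur+";")
	}
	return stmts
}

func main() {
	rows := []string{"(1,'a')", "(2,'b')", "(3,'c')", "(4,'d')"}
	for _, s := range buildInserts("t", rows, 48) {
		fmt.Println(s)
	}
}
```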

elvizlai commented 4 years ago

@lance6716 Thanks, further questions will be discussed in TUG. Please close this issue.