pingcap-inc / tidb2dw

Replicate data changes from TiDB to Data Warehouses in real-time
MIT License

Bigquery - Exceeded rate limits: too many table update operations for this table. #89

Closed: eugen-korentsov closed this issue 7 months ago

eugen-korentsov commented 9 months ago

While loading a big table (40 million rows), BigQuery responds with a rate limit error:

[2024/01/18 09:33:11.501 +00:00] [INFO] [snapshot.go:123] ["Loading snapshot data into data warehouse"] [table=database.estates] [path=database.estates.0000000310000.csv]
[2024/01/18 09:33:11.527 +00:00] [ERROR] [snapshot.go:125] ["Failed to load snapshot data into data warehouse"] [table=database.estates] [error="Bigquery load snapshot job completed with error: {Location: \"table.write\"; Message: \"Exceeded rate limits: too many table update operations for this table. For more information, see https://cloud.google.com/bigquery/docs/troubleshoot-quotas\"; Reason: \"rateLimitExceeded\"}"] [errorVerbose="Bigquery load snapshot job completed with error: {Location: \"table.write\"; Message: \"Exceeded rate limits: too many table update operations for this table. For more information, see https://cloud.google.com/bigquery/docs/troubleshoot-quotas\"; Reason: \"rateLimitExceeded\"}\ngithub.com/pingcap-inc/tidb2dw/pkg/bigquerysql.loadGCSFileToBigQuery\n\t/home/eugene/work/database/tidb2dw/pkg/bigquerysql/operation.go:52\ngithub.com/pingcap-inc/tidb2dw/pkg/bigquerysql.(*BigQueryConnector).LoadSnapshot\n\t/home/eugene/work/database/tidb2dw/pkg/bigquerysql/connector.go:112\ngithub.com/pingcap-inc/tidb2dw/replicate.(*SnapshotReplicateSession).Run.func2.1\n\t/home/eugene/work/database/tidb2dw/replicate/snapshot.go:124\nruntime.goexit\n\t/usr/lib/go-1.21/src/runtime/asm_amd64.s:1650"] [path=database.estates.0000000280000.csv]

eugen-korentsov commented 9 months ago

@Lloyd-Pottiger sorry again, is there maybe another quick fix available?

Lloyd-Pottiger commented 9 months ago

https://github.com/pingcap-inc/tidb2dw/blob/482b30e1a368fc6909fe6e66f23b3fad434a6858/replicate/snapshot.go#L24

Try changing the 16 to 8.
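
For context, the line referenced above appears to set the snapshot-load concurrency, and lowering it reduces how many BigQuery load jobs hit the same table at once. A minimal sketch of the idea, with illustrative identifier and helper names rather than the actual tidb2dw code:

package snapshot

// Illustrative sketch only: the constant at replicate/snapshot.go#L24 and the
// surrounding code may differ. The point is to bound how many load jobs run
// against the same BigQuery table concurrently.

import "golang.org/x/sync/errgroup"

const snapshotConcurrency = 8 // lowered from 16 to stay under the per-table update rate limit

func loadAll(files []string, loadOne func(string) error) error {
    g := new(errgroup.Group)
    g.SetLimit(snapshotConcurrency) // at most snapshotConcurrency loads in flight
    for _, f := range files {
        f := f // capture loop variable for the goroutine
        g.Go(func() error { return loadOne(f) })
    }
    return g.Wait() // returns the first error, if any
}

With the limit set to 1 this degenerates to the single-threaded run mentioned later in the thread.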

eugen-korentsov commented 9 months ago

I tried it with 8 and encountered the same issue. Also tried with 4, but still had the problem. However, with 2 it looks like it works fine: it has processed 150 CSVs out of 1360 so far without errors (though slower, that's okay for us now).

eugen-korentsov commented 9 months ago

UPD: hit the rate limit issue for 6 CSV files out of 1300+; restarted with 1 thread.

eugen-korentsov commented 9 months ago

After starting the tidb2dw process following 1 day of full sync, it began uploading the incremental parts:

[2024/01/19 11:59:16.691 +00:00] [INFO] [connector.go:145] ["Successfully merge file"] [file=database/events/447121134133706885/2024-01-19/CDC00000000000000000108.csv]
[2024/01/19 11:59:22.222 +00:00] [INFO] [connector.go:145] ["Successfully merge file"] [file=database/events/447121134133706885/2024-01-19/CDC00000000000000000109.csv]
[2024/01/19 11:59:28.973 +00:00] [INFO] [connector.go:145] ["Successfully merge file"] [file=database/events/447121134133706885/2024-01-19/CDC00000000000000000110.csv]
[2024/01/19 11:59:35.438 +00:00] [INFO] [connector.go:145] ["Successfully merge file"] [file=database/events/447121134133706885/2024-01-19/CDC00000000000000000111.csv]
[2024/01/19 11:59:43.688 +00:00] [INFO] [connector.go:145] ["Successfully merge file"] [file=database/events/447121134133706885/2024-01-19/CDC00000000000000000112.csv]

[2024/01/19 11:59:44.472 +00:00] [ERROR] [increment.go:413] ["error occurred while running increment replicate session"] [table=database.events] [error="Bigquery load snapshot job completed with error: {Location: \"load_job_per_table.long\"; Message: \"Quota exceeded: Your table exceeded quota for imports or query appends per table. For more information, see https://cloud.google.com/bigquery/docs/troubleshoot-quotas\"; Reason: \"quotaExceeded\"}"] [errorVerbose="Bigquery load snapshot job completed with error: {Location: \"load_job_per_table.long\"; Message: \"Quota exceeded: Your table exceeded quota for imports or query appends per table. For more information, see https://cloud.google.com/bigquery/docs/troubleshoot-quotas\"; Reason: \"quotaExceeded\"}\ngithub.com/pingcap-inc/tidb2dw/pkg/bigquerysql.loadGCSFileToBigQuery\n\t/home/eugene/work/database/tidb2dw/pkg/bigquerysql/operation.go:52\ngithub.com/pingcap-inc/tidb2dw/pkg/bigquerysql.(*BigQueryConnector).LoadIncrement\n\t/home/eugene/work/database/tidb2dw/pkg/bigquerysql/connector.go:130\ngithub.com/pingcap-inc/tidb2dw/replicate.(*IncrementReplicateSession).syncExecDMLEvents\n\t/home/eugene/work/database/tidb2dw/replicate/increment.go:268\ngithub.com/pingcap-inc/tidb2dw/replicate.(*IncrementReplicateSession).handleNewFiles\n\t/home/eugene/work/database/tidb2dw/replicate/increment.go:364\ngithub.com/pingcap-inc/tidb2dw/replicate.(*IncrementReplicateSession).Run\n\t/home/eugene/work/database/tidb2dw/replicate/increment.go:386\ngithub.com/pingcap-inc/tidb2dw/replicate.StartReplicateIncrement\n\t/home/eugene/work/database/tidb2dw/replicate/increment.go:412\ngithub.com/pingcap-inc/tidb2dw/cmd.Replicate.func1\n\t/home/eugene/work/database/tidb2dw/cmd/core.go:279\nruntime.goexit\n\t/usr/lib/go-1.21/src/runtime/asm_amd64.s:1650"]

It seems CDC generated more than 1500 CSVs during that time and we hit BigQuery's table modifications limit: https://cloud.google.com/bigquery/quotas#standard_tables. During a full dump, this error can also occur if the table is too large. I tried using the --cdc.file-size 536870912 and --cdc.flush-interval 5m options in tidb2dw, but they seem to have no effect.

eugen-korentsov commented 9 months ago

We can't use it with a big table: the dump generates 7500 CSVs, then tidb2dw starts uploading the files one by one, and after 1500 CSVs we hit the limit error. It is also very slow. I tried uploading all the CSVs manually with bq load and a wildcard, and it works very fast, like:

# bq load --project_id test --allow_quoted_newlines --allow_jagged_rows  --null_marker='\N' --source_format=CSV tidb_prod.test gs://test-tidb2dw/test-prod/snapshot/*.csv
Waiting on bqjob_r6eb3c7a55a2f1312_0000018d4a15ad62_1 ... (65s) Current status: DONE 

65 seconds to load a 300+ GB, 171+ million row database, with no limit issues.
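
For reference, the same single-job wildcard load can be expressed with the Go BigQuery client roughly as below. This is a sketch of the approach, not the tidb2dw implementation; the function name and parameters are placeholders, and it mirrors the CSV options used in the bq command above.

package snapshot

import (
    "context"

    "cloud.google.com/go/bigquery"
)

// loadSnapshotWildcard is a hypothetical helper: it submits one load job that
// covers every CSV under a GCS prefix, instead of one job per file, so only a
// single table update counts against BigQuery's per-table quotas.
func loadSnapshotWildcard(ctx context.Context, projectID, dataset, table, gcsURI string) error {
    client, err := bigquery.NewClient(ctx, projectID)
    if err != nil {
        return err
    }
    defer client.Close()

    // e.g. gcsURI = "gs://test-tidb2dw/test-prod/snapshot/*.csv"
    ref := bigquery.NewGCSReference(gcsURI)
    ref.SourceFormat = bigquery.CSV
    ref.AllowQuotedNewlines = true
    ref.AllowJaggedRows = true
    ref.NullMarker = `\N`

    loader := client.Dataset(dataset).Table(table).LoaderFrom(ref)
    loader.WriteDisposition = bigquery.WriteAppend

    job, err := loader.Run(ctx)
    if err != nil {
        return err
    }
    status, err := job.Wait(ctx)
    if err != nil {
        return err
    }
    return status.Err() // nil if the load job succeeded
}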

Lloyd-Pottiger commented 9 months ago

Hey @eugen-korentsov, thanks for trying tidb2dw and finding so many issues.

We can simply fix this issue by replacing the following code

https://github.com/pingcap-inc/tidb2dw/blob/b67731bf12d2eaa7f39ac89310d8165dc4cb381b/replicate/snapshot.go#L97-L146

with

// Load the whole snapshot for the table with a single wildcard path
// ("<database>.<table>.*"), so the warehouse runs one load job per table
// instead of one per CSV file.
if err := sess.DataWarehousePool.LoadSnapshot(sess.SourceTable, fmt.Sprintf("%s.%s.*", sess.SourceDatabase, sess.SourceTable)); err != nil {
    sess.logger.Error("Failed to load snapshot data into data warehouse", zap.Error(err))
}

But this is not a suitable approach for all warehouses, so I will not open a PR against the main branch yet. We need more time to test with different warehouses.

Thanks for your feedback again~

eugen-korentsov commented 8 months ago

@Lloyd-Pottiger thank you, the code above works fine.

Lloyd-Pottiger commented 7 months ago

Fixed by a17f4cc898b67b9c82c55244be3efb752f03bd6a.