pingcap-inc / tidb2dw

Replicate data changes from TiDB to Data Warehouses in real-time
MIT License
13 stars 7 forks source link

Bigquery - Failed to create table: googleapi: Error 400: Syntax error #90

Closed eugen-korentsov closed 7 months ago

eugen-korentsov commented 9 months ago
2024/01/19 13:10:24.384 +00:00] [ERROR] [snapshot.go:176] ["Failed to load snapshot"] [table=database.track_changes] [error="Failed to create table: googleapi: Error 400: Syntax error: Expected \")\" or \",\" but got keyword HASH at [9:5], invalidQuery"] [errorVerbose="googleapi: Error 400: Syntax error: Expected \")\" or \",\" but got keyword HASH at [9:5], invalidQuery\ngithub.com/pingcap/errors.AddStack\n\t/home/eugene/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20231212100244-799fae176cfb/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/home/eugene/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20231212100244-799fae176cfb/juju_adaptor.go:15\ngithub.com/pingcap-inc/tidb2dw/pkg/bigquerysql.runQuery\n\t/home/eugene/work/database/tidb2dw/pkg/bigquerysql/operation.go:17\ngithub.com/pingcap-inc/tidb2dw/pkg/bigquerysql.(*BigQueryConnector).CopyTableSchema\n\t/home/eugene/work/database/tidb2dw/pkg/bigquerysql/connector.go:103\ngithub.com/pingcap-inc/tidb2dw/replicate.(*SnapshotReplicateSession).Run\n\t/home/eugene/work/database/tidb2dw/replicate/snapshot.go:88\ngithub.com/pingcap-inc/tidb2dw/replicate.StartReplicateSnapshot\n\t/home/eugene/work/database/tidb2dw/replicate/snapshot.go:175\ngithub.com/pingcap-inc/tidb2dw/cmd.Replicate.func1\n\t/home/eugene/work/database/tidb2dw/cmd/core.go:271\nruntime.goexit\n\t/usr/lib/go-1.21/src/runtime/asm_amd64.s:1650\nFailed to create table"]

Schema:
CREATE TABLE `track_changes` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `source_id` int(10) unsigned DEFAULT NULL,
  `source_estate_id` int(10) unsigned DEFAULT NULL,
  `field_name` varchar(32) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `old_value` text COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `new_value` text COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `protected` tinyint(4) NOT NULL DEFAULT '0',
  `hash` varchar(32) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT '',
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  UNIQUE KEY `hash` (`hash`),
  KEY `source_id` (`source_id`),
  KEY `source_estate_id` (`source_estate_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci AUTO_INCREMENT=562747847

Similar errors:

Error 400: Syntax error: Expected \")\" or \",\" but got keyword NEW at [16:5]

Schema:
CREATE TABLE `processing` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `processing_date` date NOT NULL,
  `source_id` int(10) unsigned NOT NULL,
  `server` varchar(16) NOT NULL DEFAULT '',
  `pid` int(10) unsigned DEFAULT NULL,
  `status` enum('processing','done','fail') NOT NULL DEFAULT 'processing',
  `step` enum('none','prepare_price_ranges','test') NOT NULL DEFAULT 'none',
  `fail_retry` int(11) NOT NULL DEFAULT '0',
  `pages` int(10) unsigned DEFAULT NULL,
  `links` int(10) unsigned DEFAULT NULL,
  `loaded` int(10) unsigned DEFAULT NULL,
  `fail_loaded` int(10) unsigned DEFAULT NULL,
  `processed` int(10) unsigned DEFAULT NULL,
  `fail_processed` int(10) DEFAULT NULL,
  `new` int(10) DEFAULT NULL,
  `changed` int(10) DEFAULT NULL,
  `changed_photos` int(11) DEFAULT NULL,
  `backup_step` enum('archieve','upload','clean') DEFAULT NULL,
  `force_update` tinyint(4) NOT NULL DEFAULT '0',
  `deployment_id` int(10) unsigned DEFAULT NULL,
  `log` mediumtext DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `target_mode` varchar(2) NOT NULL DEFAULT 'v1',
  `target_mode_version` int(3) unsigned NOT NULL DEFAULT '1',
  `pending_confirm_reason` int(10) DEFAULT NULL,
  `pending_confirmed_reasons` int(10) DEFAULT NULL,
  `pending_confirm_step` varchar(50) DEFAULT NULL,
  `ramscore` int(10) unsigned NOT NULL DEFAULT '0',
  `post_actions` mediumtext DEFAULT NULL,
  `reservations_history` text DEFAULT NULL,
  `reservation_user_id` int(10) DEFAULT NULL,
  `log_ray_id` varchar(32) DEFAULT NULL,
  `log_processing_flags` int(10) DEFAULT NULL,
  `delayed_till` datetime DEFAULT NULL,
  `workflow` varchar(20) NOT NULL DEFAULT 'normal',
  `finish_max_dt` datetime DEFAULT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  UNIQUE KEY `processing_date` (`processing_date`,`source_id`),
  KEY `source_id` (`source_id`),
  KEY `status` (`status`),
  KEY `processing_key_for_servers` (`finish_max_dt`,`processing_date`,`status`),
  KEY `created_at` (`created_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=27157373

Error 400: Syntax error: Expected \")\" or \",\" but got keyword GROUP at [11:5],

Schema:
CREATE TABLE `users` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `erew` varchar(64) NOT NULL DEFAULT '',
  `name` varchar(64) NOT NULL DEFAULT '',
  `real_name` varchar(64) DEFAULT NULL,
  `dashboard` enum('default','sandbox') NOT NULL DEFAULT 'default',
  `slk_name` varchar(128) DEFAULT NULL,
  `rndm` varchar(64) NOT NULL DEFAULT '',
  `remember_token` varchar(100) DEFAULT NULL,
  `views` int(11) NOT NULL DEFAULT '0',
  `group` int(11) NOT NULL DEFAULT '0',
  `hidden_arbitrage` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `is_active` tinyint(4) DEFAULT '1',
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `deleted_at` datetime DEFAULT NULL,
  `acl` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=541758

googleapi: Error 400: Could not cast literal \"CURRENT_TIMESTAMP\" to type DATETIME at [16:34], invalidQuery"] [errorVerbose="googleapi: Error 400: Could not cast literal \"CURRENT_TIMESTAMP\" to type DATETIME 

Schema:
CREATE TABLE `requests` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `assignee_user_id` int(11) unsigned NOT NULL,
  `reporter_user_id` int(11) unsigned NOT NULL,
  `priority` tinyint(3) unsigned NOT NULL DEFAULT '1' COMMENT '1 - low, 2 - mid, 3 - high',
  `customer_id` varchar(255) NOT NULL,
  `source_id` int(10) unsigned DEFAULT NULL,
  `solution_id` int(11) unsigned NOT NULL,
  `status` tinyint(3) unsigned NOT NULL DEFAULT '1' COMMENT '1 - open, 2 - in progress, 3 - done',
  `label_id` int(11) unsigned NOT NULL,
  `title` text DEFAULT NULL,
  `description` text DEFAULT NULL,
  `estimate` int(10) unsigned DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `reported_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `se_link` varchar(255) DEFAULT NULL,
  `job_id` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=31964

error="Failed to create table: googleapi: Error 400: Syntax error: Expected \")\" or \",\" but got keyword GROUP at [3:5], invalidQuery"

Schema:
CREATE TABLE `types` (
  `id` int(11) unsigned NOT NULL DEFAULT '0',
  `group` varchar(36) DEFAULT NULL,
  `order` int(11) DEFAULT NULL,
  `name_en` varchar(36) DEFAULT NULL,
  `name_es` varchar(36) DEFAULT NULL,
  `name_de` varchar(36) DEFAULT NULL,
  `url_en` varchar(36) DEFAULT NULL,
  `url_es` varchar(36) DEFAULT NULL,
  `url_de` varchar(36) DEFAULT NULL,
  `group_name_en` varchar(36) DEFAULT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin /* CACHED ON */
Lloyd-Pottiger commented 9 months ago

Thanks for trying tidb2dw, but recently I may do not have time to fix those issues, sorry about that......

Also, if you open a PR to fix, I will review and merge it ASAP.

breezewish commented 9 months ago

Maybe we could simply fix the issue by adding identifier escapes.

eugen-korentsov commented 9 months ago

@wd0517

still issue with

googleapi: Error 400: Could not cast literal \"CURRENT_TIMESTAMP\" to type DATETIME at [16:34], invalidQuery"] [errorVerbose="googleapi: Error 400: Could not cast literal \"CURRENT_TIMESTAMP\" to type DATETIME 

Schema:
CREATE TABLE `requests` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `assignee_user_id` int(11) unsigned NOT NULL,
  `reporter_user_id` int(11) unsigned NOT NULL,
  `priority` tinyint(3) unsigned NOT NULL DEFAULT '1' COMMENT '1 - low, 2 - mid, 3 - high',
  `customer_id` varchar(255) NOT NULL,
  `source_id` int(10) unsigned DEFAULT NULL,
  `solution_id` int(11) unsigned NOT NULL,
  `status` tinyint(3) unsigned NOT NULL DEFAULT '1' COMMENT '1 - open, 2 - in progress, 3 - done',
  `label_id` int(11) unsigned NOT NULL,
  `title` text DEFAULT NULL,
  `description` text DEFAULT NULL,
  `estimate` int(10) unsigned DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `reported_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `se_link` varchar(255) DEFAULT NULL,
  `job_id` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=31964
eugen-korentsov commented 8 months ago

@Lloyd-Pottiger @wd0517 is it possible to fix this issue with CURRENT_TIMESTAMP ?

KarynaZubko commented 7 months ago

Hello everyone! @breezewish @Lloyd-Pottiger I need your assistance here.

We faced with the same issue during an attempt to sync another one table. Error is the same as @eugen-korentsov described previously here. Could you please fix the issue on your side to proceed with table sync using CURRENT_TIMESTAMP?

wd0517 commented 7 months ago

I will try to fix this issue this week

wd0517 commented 7 months ago

I think we should consider skipping the sync of the default function between TiDB and BigQuery. They have quite different behaviors when handle the default function, essentially BigQuery is more strict, as shown below:

tidb2dw as a data changes replication tools from TiDB to data warehouse, user should not manual insert new data to warehouse. Therefore, it is safe to skip the default function.

wd0517 commented 7 months ago

This PR let tidb2dw skip sync column's default value/function, this issue can now be closed.

Lloyd-Pottiger commented 7 months ago

@eugen-korentsov @KarynaZubko

I think it is fixed by https://github.com/pingcap-inc/tidb2dw/pull/95, thanks @wd0517. current_timestamp will be ignored in create table statement, and will report an error in add column statement.