cloudspannerecosystem / spanner-dump

Command line tool for exporting a Cloud Spanner database in text format
Apache License 2.0
39 stars 13 forks source link

Invalid order of tables data dump #22

Open amb-jarek opened 1 year ago

amb-jarek commented 1 year ago

It looks data are dumped from tables in alphabetic orders therefore sometimes created dump cannot be imported.

CREATE TABLE consumer (
  id STRING(36) NOT NULL,
  merchant_id STRING(36) NOT NULL,
  company_id STRING(36) NOT NULL,
  external_id STRING(36) NOT NULL,
  type STRING(10) NOT NULL,
  status STRING(8),
  created_at TIMESTAMP NOT NULL,
) PRIMARY KEY(id);
CREATE TABLE account (
  id STRING(36) NOT NULL,
  merchant_id STRING(36) NOT NULL,
  company_id STRING(36) NOT NULL,
  type STRING(10) NOT NULL,
  status STRING(8),
  currency STRING(3) NOT NULL,
  balance FLOAT64 DEFAULT (0.0),
  available_amount FLOAT64 DEFAULT (0.0),
  activated_amount FLOAT64 DEFAULT (0.0),
  blocked_amount FLOAT64 DEFAULT (0.0),
  created_at TIMESTAMP NOT NULL,
) PRIMARY KEY(id);
CREATE TABLE account_owner (
  consumer_id STRING(36) NOT NULL,
  account_id STRING(36) NOT NULL,
  FOREIGN KEY(consumer_id) REFERENCES consumer(id),
  FOREIGN KEY(account_id) REFERENCES account(id),
) PRIMARY KEY(consumer_id, account_id);
INSERT INTO `account` (`company_id`, `balance`, `created_at`, `type`, `available_amount`, `id`, `status`, `activated_amount`, `merchant_id`, `currency`, `blocked_amount`) VALUES ("d2297748-10c6-11ee-be56-0242ac120002", 99.99, TIMESTAMP "2023-07-10T11:41:23.3128747Z", "PERSONAL", 0, "282de495-94eb-4d0c-88cc-eabfd1ab246f", "ACTIVE", 0, "2c2f18de-10c6-11ee-be56-0242ac120002", "EUR", 0);
INSERT INTO `account_owner` (`account_id`, `consumer_id`) VALUES ("282de495-94eb-4d0c-88cc-eabfd1ab246f", "791c5120-f641-438a-890e-c0c2686ae9b6");
INSERT INTO `consumer` (`merchant_id`, `status`, `company_id`, `created_at`, `external_id`, `id`, `type`) VALUES 
("2c2f18de-10c6-11ee-be56-0242ac120002", "ACTIVE", "d2297748-10c6-11ee-be56-0242ac120002", TIMESTAMP "2023-07-10T11:41:23.2741788Z", "USER1", "791c5120-f641-438a-890e-c0c2686ae9b6", "INDIVIDUAL");

As you see tables itself are created correctly consumer, account, account_owner But data of those tables are exported in alphabetic order of tables: account, account_owner, consumer This leads to SQL problems during import because data from account_owner is trying to set foreign key to consumer table record, but this record doesn't exists yet

yfuruyama commented 1 year ago

@amb-jarek Thank you for reporting this. As documented in README, currently the data order of foreign key is not respected.

So to dump the data with expected order, please run spanner-dump multiple times with --tables option so that the data order can be manually specified.

For example, the first run with --tables=consumer,account and the second run with --tables=account_owner.

amb-jarek commented 1 year ago

Thank you for fast response.

Yep this is exactly what I'm doing right now. But it is rather workaround. Would be great if I could execute spanner-dump once with preferred order of tabled to dump set in --tables parameter like: spanner-dump ... --tables=consumer,account,account_owner Unfortunately it doesn't work like that and data is dumped alphabetically

yfuruyama commented 1 year ago

Technically it's not an alphabetical order. Currently the parent table is dumped first before child tables are dumped: https://github.com/cloudspannerecosystem/spanner-dump/blob/master/table.go#L55-L65

So interleaved table order is preserved when dumping the data.