GreptimeTeam / greptimedb

An open-source, cloud-native, distributed time-series database with PromQL/SQL/Python supported. Available on GreptimeCloud.
https://greptime.com/
Apache License 2.0

Failed to run fuzz tests (target: `fuzz_insert_logical_table`) #4245

Open WenyXu opened 2 days ago

WenyXu commented 2 days ago
https://github.com/GreptimeTeam/greptimedb/actions/runs/9755006232/job/26922900846?pr=4240

Is this related to https://github.com/GreptimeTeam/greptimedb/pull/4243? @WenyXu

thread '<unnamed>' panicked at tests-fuzz/targets/fuzz_insert_logical_table.rs:307:35:
fuzz test must be succeed: 0: Failed to execute query: CREATE TABLE `CupidiTaTE`(
ts TIMESTAMP(3) TIME INDEX,
val DOUBLE,
`eT` STRING,
`VoLUPTaTe` STRING,
`ASPerNatur` STRING,
`ExPLiCabo` STRING,
PRIMARY KEY(`VoLUPTaTe`, `ASPerNatur`, `eT`, `ExPLiCabo`)
)
ENGINE=metric with ("on_physical_table" = "doLORuM");, at tests-fuzz/targets/fuzz_insert_logical_table.rs:256:14
1: Database(MySqlDatabaseError { code: Some("HY000"), number: 1815, message: "Region `4647154614272(1082, 0)` already exists" })

Originally posted by @evenyag in https://github.com/GreptimeTeam/greptimedb/issues/4240#issuecomment-2201919238
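
For reading the region ID in the error message: the number `4647154614272` is consistent with packing the pair printed next to it, `(1082, 0)`, into a single `u64`, with the first value in the upper 32 bits and the second in the lower 32 bits. The sketch below only checks that arithmetic. The function names are hypothetical and are not GreptimeDB's API, and the interpretation of the pair as (table ID, region number) is an assumption.

```rust
/// Hypothetical helper: pack two 32-bit values into one u64,
/// first value in the high 32 bits (assumed layout).
fn pack_region_id(table_id: u32, region_number: u32) -> u64 {
    ((table_id as u64) << 32) | (region_number as u64)
}

/// Hypothetical helper: split a packed u64 back into its two halves.
fn unpack_region_id(id: u64) -> (u32, u32) {
    ((id >> 32) as u32, id as u32)
}
```

With this layout, `pack_region_id(1082, 0)` gives the ID from the panic message, so both requests in the logs target the same region of the same table.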

WenyXu commented 2 days ago

Metasrv logs:

...
common_procedure::local::runner: Failed to execute procedure metasrv-procedure::CreateLogicalTables-483316c2-e426-45f0-996d-27adac338c5e, retry: true err=0: Procedure exec failed
2024-07-02T04:43:01.450796819Z stdout F 1: Retry later
2024-07-02T04:43:01.450803552Z stdout F 2: Failed to request RegionServer my-greptimedb-datanode-0.my-greptimedb-datanode.my-greptimedb:4001, code: The operation was cancelled, at src/client/src/region.rs:207:31
2024-07-02T04:43:01.450809032Z stdout F 3: Timeout expired, at src/client/src/error.rs:186:23
2024-07-02T04:43:01.451059994Z stdout F 2024-07-02T04:43:01.450873Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Procedure metasrv-procedure::CreateLogicalTables-483316c2-e426-45f0-996d-27adac338c5e retry for the 1 times after 500 millis
2024-07-02T04:43:01.964294762Z stdout F 2024-07-02T04:43:01.964048Z ERROR LocalManager::submit_root_procedure: common_procedure::local::runner: Failed to execute procedure metasrv-procedure::CreateLogicalTables-483316c2-e426-45f0-996d-27adac338c5e, retry: false err=0: Failed to execute procedure due to external error
2024-07-02T04:43:01.964319048Z stdout F 1: Failed to operate on datanode: peer-0(my-greptimedb-datanode-0.my-greptimedb-datanode.my-greptimedb:4001), at src/common/meta/src/ddl/utils.rs:36:27
2024-07-02T04:43:01.964324909Z stdout F 2: External error, at src/client/src/region.rs:62:31
2024-07-02T04:43:01.96433066Z stdout F 3: Failed to request RegionServer my-greptimedb-datanode-0.my-greptimedb-datanode.my-greptimedb:4001, code: Some entity that we attempted to create already exists, at src/client/src/region.rs:207:31
2024-07-02T04:43:01.964335188Z stdout F 4: Region `4647154614272(1082, 0)` already exists, at src/client/src/error.rs:186:23
2024-07-02T04:43:01.964561995Z stdout F 2024-07-02T04:43:01.964427Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Procedure metasrv-procedure::CreateLogicalTables-483316c2-e426-45f0-996d-27adac338c5e retry for the 1 times after 500 millis
2024-07-02T04:43:02.478148929Z stdout F 2024-07-02T04:43:02.477952Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Runner metasrv-procedure::CreateLogicalTables-483316c2-e426-45f0-996d-27adac338c5e exits

The error is returned by https://github.com/GreptimeTeam/greptimedb/blob/db5d1162f0f0087a17e4deed2b6ceeb483ac4bb7/src/metric-engine/src/engine/create.rs#L221-L224

However, we actually have a check before the region creation: https://github.com/GreptimeTeam/greptimedb/blob/db5d1162f0f0087a17e4deed2b6ceeb483ac4bb7/src/metric-engine/src/engine/create.rs#L171-L178

It seems that two create requests are being executed at the same time?

Req 1                         | Req 2
Create region A               | Create region A
Check exists                  | Check exists
Create region                 |
                              | Create region (Will fail)
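
The interleaving in the table above can be replayed deterministically. This is a minimal sketch of the suspected check-then-create race, not the metric engine's code: `RegionRegistry`, `exists`, `create_strict`, `create_if_not_exists`, and `replay_race` are hypothetical names introduced only for illustration.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical stand-in for the set of regions known to a datanode.
struct RegionRegistry {
    regions: Mutex<HashSet<u64>>,
}

#[derive(Debug, PartialEq)]
enum CreateError {
    AlreadyExists(u64),
}

impl RegionRegistry {
    fn new() -> Self {
        RegionRegistry { regions: Mutex::new(HashSet::new()) }
    }

    /// "Check exists": the lock is released as soon as this returns,
    /// so the answer can be stale by the time the caller acts on it.
    fn exists(&self, id: u64) -> bool {
        self.regions.lock().unwrap().contains(&id)
    }

    /// "Create region": strict create that fails if the region is present.
    fn create_strict(&self, id: u64) -> Result<(), CreateError> {
        if self.regions.lock().unwrap().insert(id) {
            Ok(())
        } else {
            Err(CreateError::AlreadyExists(id))
        }
    }

    /// Check and create in one critical section.
    /// Returns true if this call created the region, false if it already existed.
    fn create_if_not_exists(&self, id: u64) -> bool {
        self.regions.lock().unwrap().insert(id)
    }
}

/// Replays the table: both requests check first, then both create.
fn replay_race(id: u64) -> (Result<(), CreateError>, Result<(), CreateError>) {
    let registry = RegionRegistry::new();
    let req1_exists = registry.exists(id); // Req 1: Check exists
    let req2_exists = registry.exists(id); // Req 2: Check exists
    // Both checks saw "not present", so both requests go on to create.
    let req1 = if req1_exists { Ok(()) } else { registry.create_strict(id) };
    let req2 = if req2_exists { Ok(()) } else { registry.create_strict(id) };
    (req1, req2)
}
```

In this replay, Req 1 succeeds and Req 2 fails with `AlreadyExists`, which matches the last row of the table. If two requests really are racing, doing the check and the create under a single lock acquisition (as `create_if_not_exists` does here) would make the second request a no-op instead of an error. Whether that fits the actual engine is not something this sketch can confirm.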