Open WenyXu opened 2 days ago
Metasrv logs:
...
common_procedure::local::runner: Failed to execute procedure metasrv-procedure::CreateLogicalTables-483316c2-e426-45f0-996d-27adac338c5e, retry: true err=0: Procedure exec failed
2024-07-02T04:43:01.450796819Z stdout F 1: Retry later
2024-07-02T04:43:01.450803552Z stdout F 2: Failed to request RegionServer my-greptimedb-datanode-0.my-greptimedb-datanode.my-greptimedb:4001, code: The operation was cancelled, at src/client/src/region.rs:207:31
2024-07-02T04:43:01.450809032Z stdout F 3: Timeout expired, at src/client/src/error.rs:186:23
2024-07-02T04:43:01.451059994Z stdout F 2024-07-02T04:43:01.450873Z INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Procedure metasrv-procedure::CreateLogicalTables-483316c2-e426-45f0-996d-27adac338c5e retry for the 1 times after 500 millis
2024-07-02T04:43:01.964294762Z stdout F 2024-07-02T04:43:01.964048Z ERROR LocalManager::submit_root_procedure: common_procedure::local::runner: Failed to execute procedure metasrv-procedure::CreateLogicalTables-483316c2-e426-45f0-996d-27adac338c5e, retry: false err=0: Failed to execute procedure due to external error
2024-07-02T04:43:01.964319048Z stdout F 1: Failed to operate on datanode: peer-0(my-greptimedb-datanode-0.my-greptimedb-datanode.my-greptimedb:4001), at src/common/meta/src/ddl/utils.rs:36:27
2024-07-02T04:43:01.964324909Z stdout F 2: External error, at src/client/src/region.rs:62:31
2024-07-02T04:43:01.96433066Z stdout F 3: Failed to request RegionServer my-greptimedb-datanode-0.my-greptimedb-datanode.my-greptimedb:4001, code: Some entity that we attempted to create already exists, at src/client/src/region.rs:207:31
2024-07-02T04:43:01.964335188Z stdout F 4: Region `4647154614272(1082, 0)` already exists, at src/client/src/error.rs:186:23
2024-07-02T04:43:01.964561995Z stdout F 2024-07-02T04:43:01.964427Z INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Procedure metasrv-procedure::CreateLogicalTables-483316c2-e426-45f0-996d-27adac338c5e retry for the 1 times after 500 millis
2024-07-02T04:43:02.478148929Z stdout F 2024-07-02T04:43:02.477952Z INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Runner metasrv-procedure::CreateLogicalTables-483316c2-e426-45f0-996d-27adac338c5e exits
The error is returned by https://github.com/GreptimeTeam/greptimedb/blob/db5d1162f0f0087a17e4deed2b6ceeb483ac4bb7/src/metric-engine/src/engine/create.rs#L221-L224
However, we actually have a check before the region creation: https://github.com/GreptimeTeam/greptimedb/blob/db5d1162f0f0087a17e4deed2b6ceeb483ac4bb7/src/metric-engine/src/engine/create.rs#L171-L178
It seems that two create requests were executed at the same time?
| Req 1 | Req 2 |
| --- | --- |
| Create region A | Create region A |
| Check exists | Check exists |
| Create region | |
| | Create region (Will fail) |
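
To make the suspected interleaving concrete, here is a minimal sketch of the check-then-create race using only the standard library. It is not the metric engine's code: `Regions`, `CreateError`, `create_checked_separately`, `create_if_absent`, `run_racy`, and `run_atomic` are hypothetical names made up for illustration, and the `Barrier` exists only to force both requests past the existence check before either one creates the region.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Barrier, Mutex};
use std::thread;

/// Hypothetical stand-in for a region registry (not GreptimeDB's actual type).
#[derive(Default)]
struct Regions {
    inner: Mutex<HashSet<u64>>,
}

#[derive(Debug, PartialEq)]
enum CreateError {
    AlreadyExists(u64),
}

impl Regions {
    /// Racy variant: the existence check and the creation take the lock separately,
    /// so another request can slip in between them.
    fn create_checked_separately(&self, id: u64, barrier: &Barrier) -> Result<bool, CreateError> {
        let exists = self.inner.lock().unwrap().contains(&id);
        if exists {
            return Ok(false); // treated as an idempotent no-op
        }
        // Illustration only: make sure both requests have passed the check.
        barrier.wait();
        if self.inner.lock().unwrap().insert(id) {
            Ok(true)
        } else {
            Err(CreateError::AlreadyExists(id))
        }
    }

    /// Atomic variant: check and create happen under a single lock acquisition.
    /// Returns true if this call created the region, false if it already existed.
    fn create_if_absent(&self, id: u64) -> bool {
        self.inner.lock().unwrap().insert(id)
    }
}

/// Runs two concurrent create requests for the same region through the racy path.
fn run_racy() -> Vec<Result<bool, CreateError>> {
    let regions = Arc::new(Regions::default());
    let barrier = Arc::new(Barrier::new(2));
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let (r, b) = (Arc::clone(&regions), Arc::clone(&barrier));
            thread::spawn(move || r.create_checked_separately(4647154614272, &b))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

/// Runs the same two requests through the atomic path.
fn run_atomic() -> Vec<bool> {
    let regions = Arc::new(Regions::default());
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let r = Arc::clone(&regions);
            thread::spawn(move || r.create_if_absent(4647154614272))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

In `run_racy`, both requests see the region as absent, so one creates it and the other fails with `AlreadyExists`, matching the table above. In `run_atomic`, the second request simply observes that the region exists and no error is raised. Whether the real create path can interleave this way is still the open question here.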
Is this related to https://github.com/GreptimeTeam/greptimedb/pull/4243? @WenyXu
Originally posted by @evenyag in https://github.com/GreptimeTeam/greptimedb/issues/4240#issuecomment-2201919238