Closed pdeaudney closed 8 months ago
This is from a different citus cluster specifically setup for OLAP style workloads and querying across many shards. Data is distributed by inspection_id
or id
.
Jul 13 01:30:11am AEST U132 2698548 api_analytics_queryengine ERROR: there is no parameter $1 at character 24
WITH all_records AS
(SELECT 1 AS data0,
inspections.template_id AS col0
FROM inspections
WHERE inspections.organisation_id = $1
AND (inspections.conducted_at BETWEEN $2 AND $3)),
aggregation0_result AS
(SELECT SUM(data0) AS agg0,
col0
FROM all_records
GROUP BY col0),
overall_metric AS
(SELECT COALESCE(SUM(data0), 0),
'b4c4eac4-1c29-4c0e-b0d7-fee31f083fa9'::uuid
FROM all_records)
SELECT *
FROM aggregation0_result
UNION ALL
SELECT *
FROM overall_metric
I'm wondering if there is a race condition?? I have examples of queries occurring at the same exact timestamp (same query, different parameters)
Query with the failure is 122/64937
2023-07-19 23:14:57.453 UTC [3228322] [122/64937] [0] alexandria@inspections LOG: 00000: duration: 0.098 ms parse <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.453 UTC [3228322] [122/64937] [0] alexandria@inspections LOCATION: exec_parse_message, postgres.c:1566
2023-07-19 23:14:57.453 UTC [3228315] [118/73762] [0] alexandria@inspections LOG: 00000: duration: 4.165 ms execute <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.453 UTC [3228315] [118/73762] [0] alexandria@inspections DETAIL: parameters: $1 = 'template_a98dc9e216d340f1a9fbee3b6e6a6a9c', $2 = 'template_A98DC9E216D340F1A9FBEE3B6E6A6A9C'
2023-07-19 23:14:57.453 UTC [3228315] [118/73762] [0] alexandria@inspections LOCATION: exec_execute_message, postgres.c:2269
2023-07-19 23:14:57.454 UTC [3228317] [111/55216] [0] alexandria@inspections LOG: 00000: duration: 0.217 ms parse <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.454 UTC [3228317] [111/55216] [0] alexandria@inspections LOCATION: exec_parse_message, postgres.c:1566
2023-07-19 23:14:57.455 UTC [3228312] [121/36967] [0] alexandria@inspections LOG: 00000: duration: 4.102 ms execute <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.455 UTC [3228312] [121/36967] [0] alexandria@inspections DETAIL: parameters: $1 = 'template_790cf409211e4ff4862e6afd6d2dbf7e', $2 = 'template_790CF409211E4FF4862E6AFD6D2DBF7E'
2023-07-19 23:14:57.455 UTC [3228312] [121/36967] [0] alexandria@inspections LOCATION: exec_execute_message, postgres.c:2269
2023-07-19 23:14:57.455 UTC [3228278] [113/48244] [0] alexandria@inspections LOG: 00000: duration: 0.148 ms parse <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.455 UTC [3228278] [113/48244] [0] alexandria@inspections LOCATION: exec_parse_message, postgres.c:1566
2023-07-19 23:14:57.455 UTC [3228310] [88/148262] [0] alexandria@inspections LOG: 00000: duration: 0.087 ms parse <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.455 UTC [3228310] [88/148262] [0] alexandria@inspections LOCATION: exec_parse_message, postgres.c:1566
2023-07-19 23:14:57.456 UTC [3228308] [114/54623] [0] alexandria@inspections LOG: 00000: duration: 5.822 ms execute <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.456 UTC [3228308] [114/54623] [0] alexandria@inspections DETAIL: parameters: $1 = 'template_dd3e7bb959c54d338ea622565ad80444', $2 = 'template_DD3E7BB959C54D338EA622565AD80444'
2023-07-19 23:14:57.456 UTC [3228308] [114/54623] [0] alexandria@inspections LOCATION: exec_execute_message, postgres.c:2269
2023-07-19 23:14:57.456 UTC [3228322] [122/64937] [0] alexandria@inspections LOG: 00000: duration: 1.490 ms bind <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.456 UTC [3228322] [122/64937] [0] alexandria@inspections DETAIL: parameters: $1 = 'template_0e1a1d72f46b4616ab34c989ab5a3062', $2 = 'template_0E1A1D72F46B4616AB34C989AB5A3062'
2023-07-19 23:14:57.456 UTC [3228322] [122/64937] [0] alexandria@inspections LOCATION: exec_bind_message, postgres.c:2025
2023-07-19 23:14:57.456 UTC [3228307] [85/136838] [0] alexandria@inspections LOG: 00000: duration: 0.156 ms parse <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.456 UTC [3228307] [85/136838] [0] alexandria@inspections LOCATION: exec_parse_message, postgres.c:1566
2023-07-19 23:14:57.457 UTC [3228306] [24/180830] [0] alexandria@inspections LOG: 00000: duration: 0.108 ms parse <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.457 UTC [3228306] [24/180830] [0] alexandria@inspections LOCATION: exec_parse_message, postgres.c:1566
2023-07-19 23:14:57.457 UTC [3228309] [115/40297] [0] alexandria@inspections LOG: 00000: duration: 1.484 ms bind <unnamed>:
2023-07-19 23:14:57.456 UTC [3228308] [114/54623] [0] alexandria@inspections LOG: 00000: duration: 5.822 ms execute <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.456 UTC [3228308] [114/54623] [0] alexandria@inspections DETAIL: parameters: $1 = 'template_dd3e7bb959c54d338ea622565ad80444', $2 = 'template_DD3E7BB959C54D338EA622565AD80444'
2023-07-19 23:14:57.456 UTC [3228308] [114/54623] [0] alexandria@inspections LOCATION: exec_execute_message, postgres.c:2269
2023-07-19 23:14:57.456 UTC [3228322] [122/64937] [0] alexandria@inspections LOG: 00000: duration: 1.490 ms bind <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.456 UTC [3228322] [122/64937] [0] alexandria@inspections DETAIL: parameters: $1 = 'template_0e1a1d72f46b4616ab34c989ab5a3062', $2 = 'template_0E1A1D72F46B4616AB34C989AB5A3062'
2023-07-19 23:14:57.456 UTC [3228322] [122/64937] [0] alexandria@inspections LOCATION: exec_bind_message, postgres.c:2025
2023-07-19 23:14:57.456 UTC [3228307] [85/136838] [0] alexandria@inspections LOG: 00000: duration: 0.156 ms parse <unnamed>:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
2023-07-19 23:14:57.456 UTC [3228307] [85/136838] [0] alexandria@inspections LOCATION: exec_parse_message, postgres.c:1566
... other logs ...
2023-07-19 23:14:57.466 UTC [3228322] [122/64937] [0] alexandria@inspections ERROR: 42P02: there is no parameter $1 at character 24
2023-07-19 23:14:57.466 UTC [3228322] [122/64937] [0] alexandria@inspections CONTEXT: unnamed portal with parameters: $1 = 'template_0e1a1d72f46b4616ab34c989ab5a3062', $2 = 'template_0E1A1D72F46B4616AB34C989AB5A3062'
2023-07-19 23:14:57.466 UTC [3228322] [122/64937] [0] alexandria@inspections LOCATION: transformParamRef, parse_expr.c:842
2023-07-19 23:14:57.466 UTC [3228322] [122/64937] [0] alexandria@inspections STATEMENT:
SELECT templates.id, templates.rev, templates.json, meta.template_data as meta_template_data, meta.date_draft_modified, meta.draft_author_name
FROM templates
LEFT JOIN templates_meta as meta
ON templates.id = meta.id
WHERE (templates.id = $1 OR templates.id = $2)
Further research seems to indicate to me this is a driver issue against PostgreSQL rather than Citus.
I have multiple databases on multiple citus clusters that show the same error at a low rate.
So far I have only seen this on databases that are using Citus 11.2, PostgreSQL 14.8. And only on a few queries that seemingly involve either joins between sharded & non sharded tables. Or OLAP style queries that query all shards.
We're using different PostgreSQL drivers (node-pg/javascript) & (database/sql / jackc/sqlx on golang). The errors have been observed in 3 different code bases with two different database drivers. Drivers are updated to latest releases.
Our DB traffic flow is application -> TCP load balancer -> coordinator -> TCP load balancer -> worker node.
We are not using any postgresql aware proxying, just straight up TCP load balancing based that directs traffic to the healthy primary node from patroni. There have been no switch overs or failovers in patroni during the logged errors.
This table is sharded in Citus by the
id
column.This table is not sharded in Citus and exists on the coordinator.
Unfortunately this one hasn't occurred again with the additional debug logging enabled on this database. This table is sharded in Citus by
asset_id
(so the query is hitting all shards).Details of schema & version info can be found in https://gist.github.com/pdeaudney/28d991c3831c2f5963461dd47669bedd
Any tips or advice on how to troubleshoot this further would be appreciated.