Closed mrob95 closed 7 months ago
I bisected this. On 0.7.7 running the above code succeeds, and looking at the searches table I can see that the usecount
for the search is 2 as expected. On 0.7.9 it fails and usecount
is only 1. I think this means it was introduced in https://github.com/stac-utils/pgstac/pull/183/commits/194b08b9094830734def3902fc0b8caa21686117#diff-cd06c27f5f0a39c747525cce5814f41f9bc0245fe8317c7c8f163984bd8dcdea
Just for the record, we're experiencing this issue as well.
@mrob95 So far I have not been able to replicate this using your script (or just running in parallel using the following bash script) even cranking the number of parallel processes up to 12.
#!/bin/bash
q="SELECT count(*) from search('{\"limit\":1}');"
psql -c "$q" &
psql -c "$q" &
psql -c "$q" &
psql -c "$q" &
psql -c "$q" &
psql -c "$q" &
If anything, I would guess that that bit of code that you found is the culprit. Can you see if you can replicate if you change the search_query function out to this which should ensure that the search variable doesn't get overwritten?
CREATE OR REPLACE FUNCTION search_query(
_search jsonb = '{}'::jsonb,
updatestats boolean = false,
_metadata jsonb = '{}'::jsonb
) RETURNS searches AS $$
DECLARE
search searches%ROWTYPE;
cached_search searches%ROWTYPE;
pexplain jsonb;
t timestamptz;
i interval;
_hash text := search_hash(_search, _metadata);
doupdate boolean := FALSE;
insertfound boolean := FALSE;
ro boolean := pgstac.readonly();
BEGIN
IF ro THEN
updatestats := FALSE;
END IF;
SELECT * INTO search FROM searches
WHERE hash=_hash;
search.hash := _hash;
-- Calculate the where clause if not already calculated
IF search._where IS NULL THEN
search._where := stac_search_to_where(_search);
ELSE
doupdate := TRUE;
END IF;
-- Calculate the order by clause if not already calculated
IF search.orderby IS NULL THEN
search.orderby := sort_sqlorderby(_search);
ELSE
doupdate := TRUE;
END IF;
PERFORM where_stats(search._where, updatestats, _search->'conf');
IF NOT ro THEN
IF NOT doupdate THEN
INSERT INTO searches (search, _where, orderby, lastused, usecount, metadata)
VALUES (_search, search._where, search.orderby, clock_timestamp(), 1, _metadata)
ON CONFLICT (hash) DO NOTHING RETURNING * INTO cached_search;
IF FOUND THEN
RETURN cached_search;
END IF;
END IF;
UPDATE searches
SET
lastused=clock_timestamp(),
usecount=usecount+1
WHERE hash=(
SELECT hash FROM searches
WHERE hash=_hash
FOR UPDATE SKIP LOCKED
);
IF NOT FOUND THEN
RAISE NOTICE 'Did not update stats for % due to lock. (This is generally OK)', _search;
END IF;
END IF;
RETURN search;
END;
$$ LANGUAGE PLPGSQL SECURITY DEFINER;
cc @drnextgis :point_up:
Thanks, @bitner. I'll give it a look. I can replicate the issue easily following @mrob95's suggestion:
$ go run main.go
2024/04/17 10:53:21 Successfully connected!
2024/04/17 10:53:21 Search query: "SELECT search('{\"limit\": 1}'::jsonb);"
2024/04/17 10:53:21 [0] Result: {"next": null, "prev": null, "type": "FeatureCollection", "context": {"limit": 1, "returned": 0}, "features": []}
2024/04/17 10:53:21 [1] Error: pq: syntax error at or near "LIMIT"
exit status 1
This occurs right after the initial execution. To reproduce it, I need to clear the searches
table.
@bitner, with the changes you suggested, I can't seem to reproduce the issue, so it looks like it did the trick. However, I have a question. After the initial execution, the value of usecount
is 2:
stac=# select * from searches;
hash | search | _where | orderby | lastused | usecount | metadata
----------------------------------+--------------+--------+------------------------+------------------------------+----------+----------
c53f4ebe9b2a50bc2b52fd88a5d503e1 | {"limit": 1} | TRUE | datetime DESC, id DESC | 2024-04-17 08:59:49.10575+00 | 2 | {}
(1 row)
But after the subsequent execution, it increases by 1. Shouldn't it increase by 2?
stac=# select * from searches;
hash | search | _where | orderby | lastused | usecount | metadata
----------------------------------+--------------+--------+------------------------+-------------------------------+----------+----------
c53f4ebe9b2a50bc2b52fd88a5d503e1 | {"limit": 1} | TRUE | datetime DESC, id DESC | 2024-04-17 09:00:04.022792+00 | 3 | {}
(1 row)
Ideally, it should be starting at 1 and increase by 1 each time the same query comes. I do have a "skip locked" in the update, so if there is contention, it will just skip updating the usecount. The usecount is really a "nice to have" to get a general sense of what queries are being used the most and at least for my use cases, I'd rather just not update that column than need to wait for any locks to go away when there are concurrent identical queries.
Thanks for tracking this down!
@bitner, is there a way to prevent the searches
table from growing infinitely, or should it be periodically cleaned up manually?
@drnextgis I think that depends on the use case. If you are using with titiler-pgstac or anything else that is using the query hash to look up a query, clearing out the searches table would prevent being able to look those up. If you know that you are generally setting the metadata field when registering a search, you can likely safely get rid of anything that has metadata = {} or is null. The search_wheres table (if you are using context) can safely be cleared out on a cron using something like DELETE FROM search_wheres WHERE statslastupdated < now() - context_stats_ttl;
This is quite a tricky one to reproduce. We are running pgstac in production and very occasionally get
pq: syntax error at end of input
errors from pgstacsearch
queries. The queries look well formed when we log them.Looking at the database logs I can see:
We are running version 0.8.1, so that line is here - https://github.com/stac-utils/pgstac/blob/v0.8.1/src/pgstac/sql/004_search.sql#L497
I stared at this for a while and couldn't figure out how that where clause could be an empty string, so I started trying to reproduce it locally.
I ended up running
0.8.2
in a container with:Then:
main.go
import ( "database/sql" "encoding/json" "fmt" "log" "sync"
)
const ( host = "localhost" port = 5432 user = "admin" password = "admin" dbname = "stac" )
// docker run -e POSTGRES_DB=stac -e POSTGRES_PASSWORD=admin -e POSTGRES_USER=admin -e PGUSER=admin -e PGPASSWORD=admin -e PGDATABASE=stac -p 5432:5432 ghcr.io/stac-utils/pgstac:v0.8.2 postgres -c 'log_statement=all'
func main() { psqlInfo := fmt.Sprintf("host=%s port=%d user=%s "+ "password=%s dbname=%s sslmode=disable", host, port, user, password, dbname)
}
2024/01/18 22:34:44 Successfully connected! 2024/01/18 22:34:44 Search query: "SELECT search('{\"limit\": 1}'::jsonb);" 2024/01/18 22:34:44 [1] Result: {"next": null, "prev": null, "type": "FeatureCollection", "context": {"limit": 1, "returned": 0}, "features": []} 2024/01/18 22:34:44 [0] Error: pq: syntax error at or near "LIMIT" exit status 1
2024-01-18 22:34:44.440 UTC [96] STATEMENT: SELECT search('{"limit": 1}'::jsonb); 2024-01-18 22:34:44.442 UTC [96] LOG: QUERY: SELECT * FROM items WHERE TRUE ORDER BY LIMIT '2'
2024-01-18 22:34:44.442 UTC [96] CONTEXT: PL/pgSQL function search_rows(text,text,text[],integer) line 82 at RAISE SQL statement "SELECT jsonb_agg(format_item(i, _fields, hydrate)) FROM search_rows( full_where, orderby, search_where.partitions, _querylimit ) as i" PL/pgSQL function search(jsonb) line 69 at SQL statement 2024-01-18 22:34:44.442 UTC [96] STATEMENT: SELECT search('{"limit": 1}'::jsonb); 2024-01-18 22:34:44.443 UTC [96] ERROR: syntax error at or near "LIMIT" at character 77 2024-01-18 22:34:44.443 UTC [96] QUERY: SELECT * FROM items WHERE TRUE ORDER BY LIMIT '2'