Hi @roshkins!
The query is tough for several reasons:

- `LIKE` in a `WHERE` clause usually slows everything down.
- `LIMIT` also slows everything down, sometimes dramatically (see https://dba.stackexchange.com/questions/208065/why-is-limit-killing-performance-of-this-postgres-query/208072).
- Sometimes it helps to take only a limited slice of time (`WHERE timestamp > ... AND timestamp < ...`); see the sketch after this list. The most important part here is to get rid of the last 10-20 minutes of the blockchain, which is usually the reason for conflicts while performing the queries.
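A minimal sketch of that timestamp filter, assuming a hypothetical `transactions` table with a nanosecond `block_timestamp` column (the actual indexer schema may differ):

```sql
-- Restrict the scan to a bounded window and stay clear of the chain head.
-- `transactions` and `block_timestamp` are hypothetical names for illustration;
-- the timestamps here are assumed to be nanoseconds since the Unix epoch.
SELECT count(*)
FROM transactions
WHERE block_timestamp >= (extract(epoch FROM timestamp '2022-01-01') * 1e9)::bigint
  AND block_timestamp <  (extract(epoch FROM timestamp '2022-01-02') * 1e9)::bigint;
```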
Could you please re-check the idea of the query? You use `COUNT` + `LIMIT 10`, and you do not use the `GROUP BY` column for printing. I ran the query and got just a column of random numbers.
@roshkins Well, slow queries are going to be slow, and we cannot afford letting them run forever (a PostgreSQL replica limitation also comes into play here: long-running queries block new data from being synced from the master database instance).
You will need to slim your query down and compute it in chunks (e.g. compute the stats per day or hour, then sum those values together). See https://github.com/telezhnaya/near-analytics/blob/main/aggregations/db_tables/daily_gas_used.py for inspiration, and the sketch below.
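A minimal sketch of the chunking idea in SQL, assuming the same hypothetical `transactions` table and nanosecond `block_timestamp` as above. Each one-day window stays cheap enough to finish within the timeout, and the per-day results can then be summed on the client side:

```sql
-- Hypothetical per-day aggregation: run one small query per bounded window,
-- then combine the daily counts outside the database instead of issuing
-- a single huge query over the whole history.
SELECT to_timestamp(block_timestamp / 1e9)::date AS day,
       count(*)                                  AS txs
FROM transactions
WHERE block_timestamp >= (extract(epoch FROM timestamp '2022-01-01') * 1e9)::bigint
  AND block_timestamp <  (extract(epoch FROM timestamp '2022-01-02') * 1e9)::bigint
GROUP BY day;
```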
I ran this query:
and got this after trying a few times and getting a timeout after 30 seconds:
I am thinking that this might be an indexing problem. I tried it with `GROUP BY` and the JSON constraint commented out, and I still got the error.
I am willing to try poking around to see if adding indices fixes the issue. I am no DB expert, and would love help and advice.
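For reference, `LIKE` filters can sometimes be accelerated with a trigram index via PostgreSQL's `pg_trgm` extension. A minimal sketch, assuming a hypothetical `transactions` table with a text column `args_text` being matched with `LIKE` (if the real column is `jsonb`, an expression index over the extracted text would be needed instead):

```sql
-- pg_trgm provides trigram operator classes that let a GIN index
-- serve LIKE/ILIKE pattern matching. All names here are hypothetical.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX idx_transactions_args_trgm
    ON transactions
    USING gin (args_text gin_trgm_ops);
```

Note that indexes cannot be created on a read-only replica, so this would have to be done on the primary database by whoever operates it.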