powa-team / powa-web

PoWA user interface
http://powa.readthedocs.io/
73 stars 31 forks source link

Usage Issue #179

Closed viki00 closed 9 months ago

viki00 commented 10 months ago

Hello,

I cannot seem to get the Database Overview Page working in powa-web. I have only 1 remote server, i have registered the server.

Here is the error in the logs - it does know about column total_time..I looked thru all the SQL in powa_web and could not find total_time.

Here are my specs

powa_archivist 4_1_4 powa=# \dx List of installed extensions Name | Version | Schema | Description
--------------------+---------+------------+-------------------------------------------------------------- btree_gist | 1.5 | public | support for indexing common datatypes in GiST pg_qualstats | 2.0.4 | public | An extension collecting statistics about quals pg_repack | 1.4.7 | public | Reorganize tables in PostgreSQL databases with minimal locks pg_stat_kcache | 2.2.1 | public | Kernel statistics gathering pg_stat_statements | 1.7 | public | track execution statistics of all SQL statements executed pg_wait_sampling | 1.1 | public | sampling based statistics of wait events plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language powa | 4.1.4 | public | PostgreSQL Workload Analyser-core

Can you point me to where to look ?

Thanks Vik

Error Meesage

2023-08-24T06:19:39.667393-05:00 blank powa-web[3860077]: [I 230824 06:19:39 web:2271] 200 GET /server/all_servers/?from=2023-08-24+05%3A19%3A39-0500&to=2023-08-24+06%3A19%3A39-0500 (10.10.2 9.20) 45.83ms 2023-08-24T06:19:44.176308-05:00 blank powa-web[3860077]: [I 230824 06:19:44 web:2271] 200 GET /server/0/overview/ (10.10.29.20) 39.78ms 2023-08-24T06:19:44.458879-05:00 blank powa-web[3860077]: [E 230824 06:19:44 web:1798] Uncaught exception GET /server/0/metrics/databases_globals/?from=2023-08-24+05%3A19%3A44-0500&to=2023-0 8-24+06%3A19%3A44-0500 (10.10.29.20) 2023-08-24T06:19:44.458954-05:00 blank powa-web[3860077]: HTTPServerRequest(protocol='http', host='pg-powa-prd:8888', method='GET', uri='/server/0/metrics/databases_globals/?from=2023-08 -24+05%3A19%3A44-0500&to=2023-08-24+06%3A19%3A44-0500', version='HTTP/1.1', remote_ip='10.10.29.20') 2023-08-24T06:19:44.458980-05:00 blank powa-web[3860077]: Traceback (most recent call last): 2023-08-24T06:19:44.459006-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context 2023-08-24T06:19:44.459034-05:00 blank powa-web[3860077]: self.dialect.do_execute( 2023-08-24T06:19:44.459056-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 736, in do_execute 2023-08-24T06:19:44.459077-05:00 blank powa-web[3860077]: cursor.execute(statement, parameters) 2023-08-24T06:19:44.459099-05:00 blank powa-web[3860077]: psycopg2.errors.UndefinedColumn: column "total_time" does not exist 2023-08-24T06:19:44.459121-05:00 blank powa-web[3860077]: LINE 2: ... ts) - sum(calls), 0) AS calls, greatest(lead(sum(total_time... 2023-08-24T06:19:44.459143-05:00 blank powa-web[3860077]: ^ 2023-08-24T06:19:44.459167-05:00 blank powa-web[3860077]: 2023-08-24T06:19:44.459190-05:00 blank powa-web[3860077]: 2023-08-24T06:19:44.459212-05:00 blank powa-web[3860077]: The above exception was the direct cause of the following exception: 2023-08-24T06:19:44.459234-05:00 blank powa-web[3860077]: 2023-08-24T06:19:44.459254-05:00 blank powa-web[3860077]: Traceback (most recent call last): 2023-08-24T06:19:44.459280-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/tornado/web.py", line 1711, in _execute 2023-08-24T06:19:44.459302-05:00 blank powa-web[3860077]: result = method(self.path_args, self.path_kwargs) 2023-08-24T06:19:44.459323-05:00 blank powa-web[3860077]: File "/usr/local/bin/powa-web-4.0.1/powa/dashboards.py", line 95, in get 2023-08-24T06:19:44.459344-05:00 blank powa-web[3860077]: values = self.execute(query, params=url_params) 2023-08-24T06:19:44.459365-05:00 blank powa-web[3860077]: File "/usr/local/bin/powa-web-4.0.1/powa/framework.py", line 343, in execute 2023-08-24T06:19:44.459392-05:00 blank powa-web[3860077]: return engine.execute(query, params) 2023-08-24T06:19:44.459413-05:00 blank powa-web[3860077]: File "", line 2, in execute 2023-08-24T06:19:44.459434-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/deprecations.py", line 402, in warned 2023-08-24T06:19:44.459459-05:00 blank powa-web[3860077]: return fn(args, *kwargs) 2023-08-24T06:19:44.459482-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 3257, in execute 2023-08-24T06:19:44.459503-05:00 blank powa-web[3860077]: return connection.execute(statement, multiparams, **params) 2023-08-24T06:19:44.459525-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1380, in execute 2023-08-24T06:19:44.459546-05:00 blank powa-web[3860077]: return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS) 2023-08-24T06:19:44.459568-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/sql/elements.py", line 333, in _execute_on_connection 2023-08-24T06:19:44.459589-05:00 blank powa-web[3860077]: return connection._execute_clauseelement( 2023-08-24T06:19:44.459615-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1572, in _execute_clauseelement 2023-08-24T06:19:44.459636-05:00 blank powa-web[3860077]: ret = self._execute_context( 2023-08-24T06:19:44.459658-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1943, in _execute_context 2023-08-24T06:19:44.459679-05:00 blank powa-web[3860077]: self._handle_dbapi_exception( 2023-08-24T06:19:44.459700-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 2124, in _handle_dbapiexception 2023-08-24T06:19:44.459721-05:00 blank powa-web[3860077]: util.raise( 2023-08-24T06:19:44.459743-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/util/compat.py", line 208, in raise_ 2023-08-24T06:19:44.459772-05:00 blank powa-web[3860077]: raise exception 2023-08-24T06:19:44.459795-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context 2023-08-24T06:19:44.459816-05:00 blank powa-web[3860077]: self.dialect.do_execute( 2023-08-24T06:19:44.459837-05:00 blank powa-web[3860077]: File "/usr/local/lib/python3.8/dist-packages/sqlalchemy/engine/default.py", line 736, in do_execute 2023-08-24T06:19:44.459858-05:00 blank powa-web[3860077]: cursor.execute(statement, parameters) 2023-08-24T06:19:44.459880-05:00 blank powa-web[3860077]: sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedColumn) column "total_time" does not exist 2023-08-24T06:19:44.459902-05:00 blank powa-web[3860077]: LINE 2: ... ts) - sum(calls), 0) AS calls, greatest(lead(sum(total_time... 2023-08-24T06:19:44.459927-05:00 blank powa-web[3860077]: ^

2023-08-24T06:19:44.459948-05:00 blank powa-web[3860077]: 2023-08-24T06:19:44.459981-05:00 blank powa-web[3860077]: [SQL: SELECT anon_1.srvid, EXTRACT(epoch FROM anon_1.ts) AS ts, sum(anon_1.calls) / greatest(EXTRACT(epoch FROM anon_1.mesure_in terval), %(greatest_1)s) AS calls, sum(anon_1.runtime) / greatest(sum(anon_1.calls), %(greatest_2)s) AS avg_runtime, sum(anon_1.runtime) / greatest(EXTRACT(epoch FROM anon_1.mesure_interval), %(great est_3)s) AS load, (sum(anon_1.shared_blks_read + anon_1.local_blks_read + anon_1.temp_blks_read) block_size.block_size) / EXTRACT(epoch FROM CASE WHEN (min(anon_1.mesure_interval) = %(min_1)s) THEN %(param_1)s ELSE min(anon_1.mesure_interval) END) AS total_blks_read, (sum(anon_1.shared_blks_hit + anon_1.local_blks_hit) block_size.block_size) / EXTRACT(epoch FROM CASE WHEN (min(anon1.mesure interval) = %(min_2)s) THEN %(param_2)s ELSE min(anon_1.mesure_interval) END) AS total_blks_hit, (sum(anon_1.shared_blks_read + anon_1.local_blks_read + anon_1.temp_blks_read) * block_size.block_size ) / EXTRACT(epoch FROM CASE WHEN (min(anon_1.mesure_interval) = %(min_3)s) THEN %(param_3)s ELSE min(anon_1.mesure_interval) END) - sum(anon_2.reads) / greatest(EXTRACT(epoch FROM anon_1.mesure_inter val), %(greatest_4)s) AS total_sys_hit, sum(anon_2.reads) / greatest(EXTRACT(epoch FROM anon_1.mesure_interval), %(greatest_5)s) AS total_disk_read, sum(anon_2.minflts) / EXTRACT(epoch FROM greatest( anon_1.mesure_interval, %(greatest_6)s)) AS minflts, sum(anon_2.majflts) / EXTRACT(epoch FROM greatest(anon_1.mesure_interval, %(greatest_7)s)) AS majflts, sum(anon_2.nvcsws) / EXTRACT(epoch FROM greatest(anon_1.mesure_interval, %(greatest_8)s)) AS nvcsws, sum(anon_2.nivcsws) / EXTRACT(epoch FROM greatest(anon_1.mesure_interval, %(greatest_9)s)) AS nivcsws

rjuju commented 10 months ago

Hi,

That's indeed strange. The total_time column was used for pg_stat_statements, but was renamed and split to total_plan_time and total_exec_time in powa 4.1.0.

I'm wondering if you have leftovers of previous powa-web installations somewhere on your machine, and for some reason one of the imported file is from an old (and incompatible version) rather than 4.1.x.

You're also mentioning that you have 1 remote server, but looking at the logs:

200 GET /server/0/overview/ (10.10.29.20) 39.78ms

The server id 0 is hardcoded for the "local server", ie. the repository server. If you registered a remote server, it should be the server id 1, or at least something different than 0. Did you setup powa-collector too?

viki00 commented 10 months ago

Thank you, i found the wrong powa-web - removed it and it works.

Yes thanks, in the GUI, i picked the local server. That is my mistake.

However, the one issue that has been plaguing me is - Bigint out of range

1882503|powa|powa|pvlchi6ppgsql1.peak6.net(44690)|2023-08-25 09:41:32 CDT|WARNING: powa_take_snapshot(9): function "powa_qualstats_snapshot" failed: state : 22003 message: bigint out of range detail : hint : context: SQL statement "WITH capture AS ( SELECT FROM powa_qualstats_src(_srvid) q WHERE EXISTS (SELECT 1 FROM powa_statements s WHERE s.srvid = _srvid AND q.queryid = s.queryid AND q.dbid = s.dbid AND q.userid = s.userid) ), missing_quals AS ( INSERT INTO public.powa_qualstats_quals (srvid, qualid, queryid, dbid, userid, quals) SELECT DISTINCT _srvid AS srvid, qs.qualnodeid, qs.queryid, qs.dbid, qs.userid, array_agg(DISTINCT q::qual_type) FROM capture qs, LATERAL (SELECT (unnest(quals)).) as q WHERE NOT EXISTS ( SELECT 1 FROM powa_qualstats_quals nh WHERE nh.srvid = _srvid AND nh.qualid = qs.qualnodeid AND nh.queryid = qs.queryid AND nh.dbid = qs.dbid AND nh.userid = qs.userid ) GROUP BY srvid, qualnodeid, qs.queryid, qs.dbid, qs.userid RETURNING ), by_qual AS ( INSERT INTO public.powa_qualstats_quals_history_current (srvid, qualid, queryid, dbid, userid, ts, occurences, execution_count, nbfiltered, mean_err_estimate_ratio, mean_err_estimate_num) SELECT _srvid AS srvid, qs.qualnodeid, qs.queryid, qs.dbid, qs.userid, ts, sum(occurences), sum(execution_count), sum(nbfiltered), avg(mean_err_estimate_ratio), avg(mean_err_estimate_num) FROM capture as qs GROUP BY srvid, ts, qualnodeid, qs.queryid, qs.dbid, qs.userid RETURNING ), by_qual_with_const AS ( INSERT INTO public.powa_qualstats_constvalues_history_current(srvid, qualid, queryid, dbid, userid, ts, occurences, execution_count, nbfiltered, mean_err_estimate_ratio, mean_err_estimate_num, constvalues) SELECT _srvid, qualnodeid, qs.queryid, qs.dbid, qs.userid, ts, occurences, execution_count, nbfiltered, mean_err_estimate_ratio, mean_err_estimate_num, constvalues FROM capture as qs ) SELECT COUNT(*) FROM capture" PL/pgSQL function powa_qualstats_snapshot(integer) line 11 at SQL statement

rjuju commented 10 months ago

Wow, it looks hard to exhaust the bigint range, especially for statistics gathered by pg_qualstats. We need to identify which counter is doing so. Can you execute each of those query on the repository server? At least one should error out:

WITH capture AS (
SELECT *
FROM powa_qualstats_src(9) q
WHERE EXISTS (SELECT 1
FROM powa_statements s
WHERE s.srvid = 9
AND q.queryid = s.queryid
AND q.dbid = s.dbid
AND q.userid = s.userid)
), test1 AS (
SELECT 9 AS srvid, qs.qualnodeid, qs.queryid, qs.dbid, qs.userid,
ts, sum(occurences),
avg(mean_err_estimate_ratio), avg(mean_err_estimate_num)
FROM capture as qs
GROUP BY srvid, ts, qualnodeid, qs.queryid, qs.dbid, qs.userid
)
SELECT count(*) FROM test1;
WITH capture AS (
SELECT *
FROM powa_qualstats_src(9) q
WHERE EXISTS (SELECT 1
FROM powa_statements s
WHERE s.srvid = 9
AND q.queryid = s.queryid
AND q.dbid = s.dbid
AND q.userid = s.userid)
), test2 AS (
SELECT 9 AS srvid, qs.qualnodeid, qs.queryid, qs.dbid, qs.userid,
ts, sum(execution_count),
avg(mean_err_estimate_ratio), avg(mean_err_estimate_num)
FROM capture as qs
GROUP BY srvid, ts, qualnodeid, qs.queryid, qs.dbid, qs.userid
)
SELECT count(*) FROM test2;
WITH capture AS (
SELECT *
FROM powa_qualstats_src(9) q
WHERE EXISTS (SELECT 1
FROM powa_statements s
WHERE s.srvid = 9
AND q.queryid = s.queryid
AND q.dbid = s.dbid
AND q.userid = s.userid)
), test3 AS (
SELECT 9 AS srvid, qs.qualnodeid, qs.queryid, qs.dbid, qs.userid,
ts, sum(nbfiltered),
avg(mean_err_estimate_ratio), avg(mean_err_estimate_num)
FROM capture as qs
GROUP BY srvid, ts, qualnodeid, qs.queryid, qs.dbid, qs.userid
)
SELECT count(*) FROM test3;
viki00 commented 10 months ago

Julien, I am not sure what to say, but I do not see the error any more. Ran the queries above, it returned 0.

Wondering if there was some incorrect data which rolled off.

Thank you so much. Vik

rjuju commented 10 months ago

Oh, I'm not sure how this could happen. The problematic rows are supposed to be stored in a local table and deleted only after being successfully processed. It also seems unlikely that the data was correct (assuming that the overflow came from sum(occurences), sum(execution_count) or sum(nbfiltered)), as it would mean that the remote server processed about 2^63 rows between two snapshots. Let me know if that problem happens again so I can investigate more.

Is everything now working correctly?

viki00 commented 9 months ago

Will do, thank you so much for your help. Apologize for the delay.

Everything is working perfecting. Vik

rjuju commented 9 months ago

No worries, I'm glad to hear that it's now working as expected!