powa-team / powa-web

PoWA user interface
http://powa.readthedocs.io/

powa-web UI error: function "powa_statements_snapshot" #198

Closed: hrawulwa closed this issue 4 months ago

hrawulwa commented 4 months ago

I have a remote setup, with both powa-collector and powa-web running on the same repository server. Versions: powa-archivist 4.1.2, powa-collector 1.1.1 and powa-web 4.1.4. I registered a new database, but the UI is reporting this error:

    UARQA2: powa_take_snapshot(7): function "powa_statements_snapshot" failed: duplicate key value violates unique constraint "powa_statements_pkey"

The snapshots are not happening because of this issue.

Thanks Hari

rjuju commented 4 months ago

Hi,

As far as I can see, a similar issue was solved in powa-archivist 4.1.3 (first item at https://github.com/powa-team/powa-archivist/releases/tag/REL_4_1_3).

Upgrading to this version should fix your issue.
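
For reference, the upgrade rjuju describes is a plain in-place extension update once the powa-archivist 4.1.3 packages are installed. A minimal sketch, assuming the extension is installed under its default name powa in the repository's dedicated powa database:

    -- Run in the database where the powa extension is installed
    -- (the dedicated "powa" database in a typical repository setup).
    -- Check the currently installed version first:
    SELECT extname, extversion FROM pg_extension WHERE extname = 'powa';
    -- Then update the extension in place; no restart or downtime is needed:
    ALTER EXTENSION powa UPDATE TO '4.1.3';

The same ALTER EXTENSION can be run on each monitored server as well if you want to update everywhere.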

hrawulwa commented 4 months ago

So, should I upgrade powa-archivist to 4.1.3 on the DB server, the repository server, or both? Note that I'm not monitoring the repository server; only the DB servers are being monitored.

Thanks Hari

rjuju commented 4 months ago

The issue is fixed on the repository side, so it should be OK if you only update the extension on this server, but I really recommend updating it everywhere. Note that updating the extension doesn't require any downtime at all, so it should be easy to do.

hrawulwa commented 4 months ago

OK. I upgraded powa-archivist to 4.1.3 on the repository server and the DB servers. The error goes away for one of the DB servers, but the other DB server still throws the same error. Any clues? I restarted powa-collector and powa-web multiple times and reloaded the collector in the UI as well. No luck.

Thanks Hari

rjuju commented 4 months ago

Do you have more details about the setup? Is this other server a remote server whose snapshots happen on the repository server where you just upgraded powa-archivist?

Can you show the full error log for the error that's still happening?

hrawulwa commented 4 months ago

Yes, this is just another remote server. The error went away on the first remote server after upgrading powa-archivist on both the repository server and the remote servers. All I see are the errors below in the powa-collector log file:

    2024-07-07 04:01:30,379 sl73ttpsdbq009:5432 WARNING: Number of errors during snapshot: 1
    2024-07-07 04:01:30,379 sl73ttpsdbq009:5432 WARNING: Check the logs on the repository server

rjuju commented 4 months ago

I see, thanks for the confirmation. Can you show the related messages in the repository server logs?

rjuju commented 4 months ago

It may also show up in the UI on the server configuration page.

hrawulwa commented 4 months ago

The errors above are the only ones in the repository log file. On the configuration page in the UI, I see the error below:

    TTSQ_Node1: powa_take_snapshot(9): function "powa_statements_snapshot" failed: duplicate key value violates unique constraint "powa_statements_pkey"

Thanks Hari

rjuju commented 4 months ago

I don't really have an answer here. The query that inserts that data returns deduplicated records and checks that they don't already exist in the table before inserting. The error in the logs should contain the problematic primary-key values, so you could check whether that key exists in the powa_statements table or in powa_statements_src_tmp for the given server (srvid 9, apparently). You could also try a VACUUM FULL of both tables in case there's index corruption.
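
For reference, the check rjuju suggests could look roughly like the following. This is a sketch only: the userid, dbid and queryid values below are hypothetical placeholders and have to be replaced with the values reported in the error on the repository server; srvid 9 is the server id mentioned above.

    -- Run on the repository server, in the database holding the powa extension.
    -- Does the reported key already exist in the target table?
    SELECT srvid, userid, dbid, queryid
      FROM powa_statements
     WHERE srvid = 9 AND userid = 10 AND dbid = 16384 AND queryid = 1234567890;

    -- Is it also present in the per-snapshot source table?
    SELECT srvid, userid, dbid, queryid
      FROM powa_statements_src_tmp
     WHERE srvid = 9 AND userid = 10 AND dbid = 16384 AND queryid = 1234567890;

    -- If index corruption is suspected, rebuild both tables and their indexes:
    VACUUM FULL powa_statements;
    VACUUM FULL powa_statements_src_tmp;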

hrawulwa commented 4 months ago

Interestingly, I see 518 records for the combination of srvid, userid, dbid and queryid (the primary key) for the problematic server (srvid = 9) in the powa_statements_src_tmp table, and no records in the powa_statements table. The other remote servers, which don't have the issue, have no records in the src_tmp table but do have records in powa_statements. By the way, I did a VACUUM FULL on both of these tables without any success.

Thanks Hari
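
For reference, a comparison like the one hrawulwa describes could be done with something along these lines (a sketch only, using the primary-key columns named above):

    -- Leftover rows per table for the problematic server (srvid = 9):
    SELECT 'powa_statements_src_tmp' AS tbl, count(*) FROM powa_statements_src_tmp WHERE srvid = 9
    UNION ALL
    SELECT 'powa_statements', count(*) FROM powa_statements WHERE srvid = 9;

    -- Check for duplicated primary-key combinations among the leftover source rows:
    SELECT srvid, userid, dbid, queryid, count(*)
      FROM powa_statements_src_tmp
     WHERE srvid = 9
     GROUP BY srvid, userid, dbid, queryid
    HAVING count(*) > 1;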

hrawulwa commented 4 months ago

I deleted the server using the powa_delete_and_purge_server function, restarted powa-collector, and re-registered the server, this time with a 7-day retention (the original was 30 days). The problem went away and I no longer see any errors. I'm not sure what fixed the problem.

Thanks Hari
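
For reference, the reset hrawulwa describes maps roughly to the following on the repository side. This is a hedged sketch: it assumes powa_delete_and_purge_server takes the numeric server id (9 in this thread), and re-registration is only pointed at because the exact powa_register_server arguments depend on the powa-archivist version.

    -- On the repository server, in the database holding the powa extension:
    -- drop the remote server and purge all of its collected data
    -- (assuming the function takes the numeric srvid).
    SELECT powa_delete_and_purge_server(9);

    -- Then restart powa-collector and re-register the server, either from the
    -- powa-web UI (as done in this thread) or with powa_register_server(...);
    -- check \df powa_register_server for the exact arguments of your version,
    -- including the retention setting (7 days here instead of the original 30).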

rjuju commented 4 months ago

I'm not sure what fixed it either, but it's good that it's now working.

I'm closing the issue; as usual, feel free to reopen it or create a new one if needed.