Closed lazystone closed 5 years ago
Would you happen to have a log ? At least of the last executed query ? Thank you.
2015-06-04 03:02:58 CEST LOG: server process (PID 24488) was terminated by signal 11: Segmentation fault 2015-06-04 03:02:58 CEST DETAIL: Failed process was running: copy (select * from service_passenger_departure_view) to '/tmp/extract_and_build_database/service_passenger_departure.dmp'
What is the definition of the view, and the underlying tables column types, for column involved in a JOIN or WHERE clause ?
Does it crash every time you run this query, or only during your long transaction ?
Thank you.
I've substituted field names by their types:
CREATE OR REPLACE VIEW service_passenger_departure_view AS
SELECT DISTINCT psd.integer_column,
pbd.integer_column,
st.character_varying_2,
st.character_varying_64,
s.character_varying_10,
s.character_varying_256,
st.integer_column,
s.character_varying_40,
ps.boolean_column,
ps.character_varying_3,
ps.timestamp_with_time_zone,
ps.integer_column,
ps.timestamp_with_time_zone
FROM st, s, pbd, psd, ps
WHERE ps.integer_column IS NOT NULL AND st.inter_column = (s.inter_column + 0) AND s.integer_column = (ps.integer_column + 0) AND pbd.integer_column = (ps.integer_column + 0) AND psd.integer_column = (ps.integer_column + 0);
I guess that it fails every time. But can't say for sure right now.
Tried to enable it once more - fails on another query now:
16:02:30 CEST LOG: server process (PID 49306) was terminated by signal 11: Segmentation fault
2015-06-04 16:02:30 CEST DETAIL: Failed process was running: /* load collection se.paxport.paxoutput.core.model.DocumentEntity.documentReceivers */ select documentre0_.document_id as document1_2_0_, documentre0_.address as address2_5_0_, documentre0_.as_attachment as as_attac3_5_0_, documentre0_.carbon_copy as carbon_c4_5_0_, documentre0_.name as name5_5_0_, documentre0_.receive_only_reportable_data as receive_6_5_0_, documentre0_.receiver_id as receiver7_5_0_, documentre0_.sent_at as sent_at8_5_0_, documentre0_.split_mail_pnl as split_ma9_5_0_, documentre0_.type as type10_5_0_, documentre0_.use_adl_all_group as use_adl11_5_0_ from t_document_receiver documentre0_ where documentre0_.document_id=$1
I tried to reproduce your issue, but without success. Could you provide a self-contained test or a dump of the required data ? (you can send it at powa (at) dalibo (dot) com if you need privacy).
Can you also specify the exact postgresql and pg_qualstats versions ?
Regards.
I will try to enable this extension again next week. The problem is that it's reproducible on production server only. Postgresql version was 9.4.3 at the moment of failure, now it's the latest one - 9.4.4. pg_qualstats seems to be version 0.0.6
Instead of enabling it on the production server, could you send us the whole database schema, as well as logs from the instance when it crashed ?
What are your configuration options for pg_qualstats ?
Hello,
Do you have any update on this issue ?
Hello, No. I haven't gotten a chance to test it again. I guess you can close the issue. If something comes up, then I'll create another one.
Ok, thanks for the answer!
Just coming here to add a "me too."
I'm getting segmentation faults using PoWA. I havent managed to isolate it yet specifically to pg_qualstats yet.
powa-archivist 3.0.1 pg_track_settings 1.0.0 pg_stat_kcache at git d7734fb8ce60f8d8ce3fa9efae627c9c730c9488 pg_qualstats at git 1ab0e68398376d8518b4cd6bfa4eae418762a7d2
Additionally - sadly I can not provide the database schema or data.
You should be able to see in postgrsql logs the query that triggered this segfault. Can you share it?
Give me some time to spin up a nonprod db.
Do you have any preference for Postgres log config to aid this debugging? I have most logging off.
On Thursday, February 11, 2016, Julien Rouhaud notifications@github.com wrote:
Reopened #52 https://github.com/dalibo/powa/issues/52.
— Reply to this email directly or view it on GitHub https://github.com/dalibo/powa/issues/52#event-546842290.
Raymond "VisualPhoenix" Barbiero
The only relevant info than you'll have in the log is the query that caused the segmentation fault. We only need this and a definition of the relations involved (and of course ideally sample data to reproduce this issue).
If you need privacy, you can send any data to powa(AT)dalibo.com.
I know this is an old thread, but I just installed powa on Enterprise DB 9.6 and I am having the same issue. I have narrowed it down to when I am joining to a particular foreign table.
2018-05-03 09:43:37 EDT LOG: server process (PID 9102) was terminated by signal 11: Segmentation fault 2018-05-03 09:43:37 EDT DETAIL: Failed process was running: select ...
I tried recreating the table as a local table and querying against that and it works fine, just not as a foreign table. Are there any limitations on foreign tables with powa?
@mbroxson which version of pg_qualstats are you using? I'm not aware of any issue with foreign tables. A quick local test doesn't cause any segfault using a foreign table. Can you provide a way to reproduce it?
I'm using pg_qualstats 1.0.4. I'm trying to get a way to reproduce it, but it only seems to be happening with this one foreign table so trying to replicate is proving difficult. I'll post something as soon as I can reproduce reliably
that's indeed strange. Maybe setting pg_qualstats.sample_rate = 1 could help to reproduce?
finally got a schema setup that I can send you to reproduce. I have two databases set up on the same server and used a fdw to link the two.
First database can be anything, run powa_test.sql and it will load a powa_test schema with tables that match the layout of our database. Second database I called powa_test_foreign and set a postgres_fdw server up as local_powa_test_foreign, then you can run the powa_test_foreign.sql
The query that I am trying to run is in powa_test_query.sql and fails even without data in these tables.
Thanks for the test case, I can reproduce the issue. The problematic query can be reduced to
select 1
from powa_test.sdr t
join powa_test.p on p.int_id = t.int_1
join powa_test.iui on iui.varchar_1 = p.varchar_17;
I'm looking at it.
I'm very I took me so long to come back to this. I just pushed https://github.com/powa-team/pg_qualstats/commit/6fb70ee68bd8bab98a102927b10c29f3a46e06e5 which should fix the issue.
Thanks a lot to @lazystone @visualphoenix and @mbroxson to report the issue and @mbroxson for the reproducer. It was super helpful.
Sorry again for the time needed to fix this.
[628594.420330] postgres[24488]: segfault at 8 ip 00007f295308a03c sp 00007ffef54b16a0 error 4 in pg_qualstats.so[7f2953088000+5000]
The only hint - it fails on a very long transaction with a lot of inserts.