credativ / informix_fdw

Foreign Data Wrapper for Informix Databases

segfault in libifsql.so #5

Closed rossj-cargotel closed 9 years ago

rossj-cargotel commented 9 years ago

In testing, I'm getting the following segfault when attempting a simple select from a foreign table.

Jun 9 17:32:03 ec2 kernel: [589818.846658] postgres[29482]: segfault at 0 ip 00007fdfaa95417b sp 00007fff9144f350 error 4 in libifsql.so[7fdfaa941000+52000]
Jun 9 17:34:07 ec2 kernel: [589943.022806] postgres[29542]: segfault at 0 ip 00007fdfaa9d517b sp 00007fff9144f350 error 4 in libifsql.so[7fdfaa9c2000+52000]

The select immediately crashes postgres with the following message:

psql (9.4.2)
Type "help" for help.

dbscs=# select * from acct_cache;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q

I have an strace of the event if that helps.

Jeff

psoo commented 9 years ago

Hmm, that sounds strange.

A backtrace from the crashing backend would be nice, see https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Getting_a_trace_from_a_running_backend for how to get this.
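
As a rough sketch of the first step described on that wiki page, assuming the psql session is still open before the crashing query is run: `pg_backend_pid()` reports the process ID of the backend serving the current session, which is the process a debugger needs to attach to. The table name is the one from this report; the gdb steps in the comments are an outline, not a full recipe.

```sql
-- Run this in the psql session that will execute the crashing query.
-- It returns the PID of the backend process serving this session.
SELECT pg_backend_pid();

-- In a second terminal, attach gdb to that PID (for example: gdb -p <pid>),
-- let the process continue with "cont", and only then run the failing query:
SELECT * FROM acct_cache;

-- When gdb reports SIGSEGV, "bt" prints the backtrace.
```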

Also, if possible, the table definition would be fine. Which CSDK version and OS are you using exactly?

rossj-cargotel commented 9 years ago

Okay, finally got gdb working--thanks for the link!

OS: Ubuntu 14.04 LTS

CSDK:
/opt/IBM/informix/bin/check_version csdk
Currently installed version: 4.10.FC5DE
Previous latest version: 4.10.FC5
You have installed a newer version of ClientSDK over an older version

Table Definition:

create foreign table acct_cache (
    id serial,
    type_id int,
    prefix text,
    data text,
    amount numeric,
    timestamp timestamp without time zone,
    check_number text,
    client_number text,
    check_date date
) server cargotel_tcp options (
    query 'select * from acct_cache',
    database 'dbscs',
    informixdir '/opt/IBM/informix',
    informixserver 'cargodrda',
    client_locale 'en_US.utf8'
);

Backtrace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fde780 (LWP 8624)]
0x00007fffe8e0c17b in proc_srvrresp () from /opt/IBM/informix/lib/esql/libifsql.so

(gdb) bt
#0  0x00007fffe8e0c17b in proc_srvrresp () from /opt/IBM/informix/lib/esql/libifsql.so
#1  0x00007fffe8e0e0d5 in asf_connect () from /opt/IBM/informix/lib/esql/libifsql.so
#2  0x00007fffe8e0f12a in sqli_connect_open () from /opt/IBM/informix/lib/esql/libifsql.so
#3  0x00007fffe90527b1 in ifxCreateConnectionXact (coninfo=0x555555e13ec0) at ifx_connection.ec:87
#4  0x00007fffe9059e01 in ifxSetupConnection (coninfo=coninfo@entry=0x7fffffffd878, foreignTableOid=foreignTableOid@entry=65576, mode=mode@entry=IFX_PLAN_SCAN, error_ok=error_ok@entry=1 '\001') at ifx_fdw.c:2328
#5  0x00007fffe905a1b5 in ifxSetupFdwScan (coninfo=coninfo@entry=0x7fffffffd878, state=state@entry=0x7fffffffd888, plan_values=plan_values@entry=0x7fffffffd880, foreignTableOid=foreignTableOid@entry=65576, mode=mode@entry=IFX_PLAN_SCAN) at ifx_fdw.c:2460
#6  0x00007fffe905b845 in ifxGetForeignRelSize (planInfo=0x555555dd8408, baserel=0x555555dd78f0, foreignTableId=65576) at ifx_fdw.c:3022
#7  0x000055555577b3b1 in set_foreign_size (rte=0x555555dd7b10, rel=0x555555dd78f0, root=0x555555dd8408) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/optimizer/path/allpaths.c:416
#8  set_rel_size (root=root@entry=0x555555dd8408, rel=0x555555dd78f0, rti=rti@entry=1, rte=0x555555dd7b10) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/optimizer/path/allpaths.c:259
#9  0x000055555577bfd7 in set_base_rel_sizes (root=<optimized out>) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/optimizer/path/allpaths.c:189
#10 make_one_rel (root=0x555555dd8408, joinlist=0x555555e137e8) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/optimizer/path/allpaths.c:147
#11 0x000055555579443e in query_planner (root=root@entry=0x555555dd8408, tlist=tlist@entry=0x555555e12f80, qp_callback=qp_callback@entry=0x555555794c90, qp_extra=qp_extra@entry=0x7fffffffdb00) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/optimizer/plan/planmain.c:236
#12 0x0000555555795add in grouping_planner (root=root@entry=0x555555dd8408, tuple_fraction=tuple_fraction@entry=0) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/optimizer/plan/planner.c:1287
#13 0x0000555555798044 in subquery_planner (glob=glob@entry=0x555555dd7d18, parse=parse@entry=0x555555dd7a00, parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=0 '\000', tuple_fraction=0, subroot=subroot@entry=0x7fffffffdca8) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/optimizer/plan/planner.c:572
#14 0x0000555555798401 in standard_planner (parse=0x555555dd7a00, cursorOptions=0, boundParams=0x0) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/optimizer/plan/planner.c:210
#15 0x0000555555815494 in pg_plan_query (querytree=<optimized out>, cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/tcop/postgres.c:777
#16 0x0000555555815584 in pg_plan_queries (querytrees=<optimized out>, cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/tcop/postgres.c:836
#17 0x00005555558177ef in exec_simple_query (query_string=0x555555dd6bd0 "select * from acct_cache;") at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/tcop/postgres.c:1001
#18 PostgresMain (argc=<optimized out>, argv=argv@entry=0x555555d502a8, dbname=0x555555d50160 "dbscs", username=<optimized out>) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/tcop/postgres.c:4074
#19 0x00005555555e902d in BackendRun (port=0x555555d8de30) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/postmaster/postmaster.c:4164
#20 BackendStartup (port=0x555555d8de30) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/postmaster/postmaster.c:3829
#21 ServerLoop () at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/postmaster/postmaster.c:1597
#22 0x00005555557c2ae1 in PostmasterMain (argc=5, argv=<optimized out>) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/postmaster/postmaster.c:1244
#23 0x00005555555e9d33 in main (argc=5, argv=0x555555d4f320) at /tmp/buildd/postgresql-9.4-9.4.3/build/../src/backend/main/main.c:228

psoo commented 9 years ago

Okay, it crashes while establishing the connection in ifxCreateConnectionXact(), when calling this:

EXEC SQL CONNECT TO :ifxdsn AS :ifxconname
    USER :ifxuser USING :ifxpass WITH CONCURRENT TRANSACTION;

Note that this is ESQL/C, so the preprocessor translates this statement into a call to sqli_connect_open(). Could you please also post your CREATE SERVER and CREATE USER MAPPING statements, and retry selecting from the foreign table with

SET client_min_messages TO DEBUG5;

enabled beforehand? That will print various internal settings from the Informix FDW.

psoo commented 9 years ago

Jeff,

I was able to reproduce this crash by leaving out the password parameter for the connection. The backend crashes because the password option is passed uninitialized to the Informix CSDK API, which finally causes the segmentation fault. I'm surprised no one had discovered this problem before (myself included) :/

I've pushed a fix in

https://github.com/credativ/informix_fdw/commit/a843eaf7e877dd98dad65c7968a2f5452655cbca

Please note that username and password are considered mandatory within the Informix FDW API to establish a database connection.
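
A minimal sketch of a user mapping that carries both mandatory options, reusing the server name from this report. The credentials are placeholders, and the catalog query at the end is just one way to check what is stored; it relies on the standard `pg_user_mappings` view, not on anything specific to the Informix FDW.

```sql
-- Recreate the mapping so that both mandatory options are present.
-- 'some_user' and 'some_password' are placeholder values.
DROP USER MAPPING IF EXISTS FOR CURRENT_USER SERVER cargotel_tcp;

CREATE USER MAPPING FOR CURRENT_USER SERVER cargotel_tcp
    OPTIONS (username 'some_user', password 'some_password');

-- Verify which options are stored for the mapping.
-- umoptions is shown only to users allowed to see it; otherwise it is NULL.
SELECT srvname, usename, umoptions
  FROM pg_user_mappings
 WHERE srvname = 'cargotel_tcp';
```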

Please test and let me know if this works now.

rossj-cargotel commented 9 years ago

Hi Bernd,

Here's the information you requested above. I'll build the new version and test it but I have been passing the correct username and password.

drop server if exists cargotel_tcp cascade;

create server cargotel_tcp foreign data wrapper informix_fdw options (informixserver 'cargodrda');

drop user mapping if exists for current_user server cargotel_tcp;

create user mapping for current_user server cargotel_tcp options (username 'nobody',password 'railonly55');

drop foreign table if exists acct_cache;

create foreign table acct_cache (
    id serial,
    type_id int,
    prefix text,
    data text,
    amount numeric,
    timestamp timestamp without time zone,
    check_number text,
    client_number text,
    check_date date
) server cargotel_tcp options (
    query 'select * from acct_cache',
    database 'dbscs',
    informixdir '/opt/IBM/informix',
    informixserver 'cargodrda',
    client_locale 'en_US.utf8'
);

postgres@ip-172-30-0-81:~$ psql dbscs
psql (9.4.2, server 9.4.3)
Type "help" for help.

dbscs=# SET client_min_messages TO DEBUG5;
DEBUG: CommitTransactionCommand
DEBUG: CommitTransaction
DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
LOG: duration: 0.146 ms statement: SET client_min_messages TO DEBUG5;
SET
dbscs=# select * from acct_cache;
DEBUG: StartTransactionCommand
DEBUG: StartTransaction
DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
DEBUG: informix_fdw: get foreign relation size, cmd 1
DEBUG: ifx_fdw set param query=select * from acct_cache
DEBUG: ifx_fdw set param database=dbscs
DEBUG: ifx_fdw set param informixdir=/opt/IBM/informix
DEBUG: ifx_fdw set param informixserver=cargodrda
DEBUG: ifx_fdw set param client_locale=en_US.utf8
DEBUG: ifx_fdw set param informixserver=cargodrda
DEBUG: ifx_fdw set param username=nobody
DEBUG: ifx_fdw set param password=railonly55
DEBUG: informix connection dsn "dbscs@cargodrda"
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q

psoo commented 9 years ago

Hrm, so it still crashes with correct credentials? What's going on there...

rossj-cargotel commented 9 years ago

Yes, I just tested the new version and it still crashes. The backtrace is identical to the one posted above.

psoo commented 9 years ago

Hmm, I've set up an Ubuntu VM, but with CSDK 4.10FC4 (it looks like IBM doesn't provide free downloads for FC5 anymore?). If I retry your crashing example from above, I get the following:

CREATE SERVER centosifx FOREIGN DATA WRAPPER informix_fdw OPTIONS(informixserver 'ol_informix1210');

CREATE USER MAPPING FOR CURRENT_USER SERVER centosifx OPTIONS(username 'informix', password 'informix');

CREATE FOREIGN TABLE inttest (f1 bigint, f2 integer, f3 smallint) SERVER centosifx OPTIONS(query 'select * from inttest', database 'regression', informixdir '/opt/IBM/informix', informixserver 'ol_informix1210', client_locale 'en_US.utf8');

SELECT * FROM inttest;
DEBUG: informix connection dsn "regression@ol_informix1210"
ERROR: XX000: could not open connection to informix server: SQLCODE=-23197
LOCATION: ifxSetupConnection, ifx_fdw.c:2371
STATEMENT: SELECT * FROM inttest;
ERROR: could not open connection to informix server: SQLCODE=-23197

Note that you specify informixserver twice, which is redundant. Good practice is to specify INFORMIXDIR and INFORMIXSERVER within the SERVER specification.
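
One way to apply that advice to objects that already exist, sketched with the names from the example above and untested against a real Informix instance: add `informixdir` to the server (which already carries `informixserver`) and drop the duplicated options from the foreign table, using the standard `OPTIONS (ADD | DROP ...)` syntax.

```sql
-- Keep the installation path together with informixserver on the server object.
ALTER SERVER centosifx
    OPTIONS (ADD informixdir '/opt/IBM/informix');

-- Remove the now redundant copies from the foreign table definition.
ALTER FOREIGN TABLE inttest
    OPTIONS (DROP informixdir, DROP informixserver);
```

The table keeps only the options that are specific to it, such as query, database and the locale settings.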

SQLCODE -23197 means:

finderr 23197
-23197 Database locale information mismatch.

The locale information GL_CTYPE or GL_COLLATE in the system catalog of the specified database does not match the locale information in the specified environment variable DB_LOCALE. Check the value of DB_LOCALE.

So indeed, I'm missing the DB_LOCALE setting. Adding it to the foreign table gives the following (take care to use the correct locale for your instance):

ALTER FOREIGN TABLE inttest OPTIONS(ADD db_locale 'en_US.819');

SELECT * FROM inttest LIMIT 10;
  f1  | f2  | f3
------+-----+----
  100 | 200 | 20
 -199 | 120 |  1
 -198 | 120 |  2
 -197 | 120 |  3
 -196 | 120 |  4
 -195 | 120 |  5
 -194 | 120 |  6
 -193 | 120 |  7
 -192 | 120 |  8
 -191 | 120 |  9
(10 rows)

This doesn't explain why it still crashes on your side, but could you repeat the test with a correct db_locale setting?
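
Before retesting, it can help to confirm which options are actually stored for the table and its server. The following is a small sketch against the standard PostgreSQL catalogs, using the table name from this report; nothing in it is specific to the Informix FDW.

```sql
-- Show the options stored on the foreign table and on its server,
-- e.g. to check that db_locale and client_locale are set as intended.
SELECT ft.ftrelid::regclass AS foreign_table,
       ft.ftoptions         AS table_options,
       fs.srvname           AS server_name,
       fs.srvoptions        AS server_options
  FROM pg_foreign_table ft
  JOIN pg_foreign_server fs ON fs.oid = ft.ftserver
 WHERE ft.ftrelid = 'acct_cache'::regclass;
```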

rossj-cargotel commented 9 years ago

Very interesting!

After verifying the correct server locale with onstat -g env, I altered my create script as follows:

drop server if exists cargotel_tcp cascade;

create server cargotel_tcp foreign data wrapper informix_fdw options (informixserver 'cargodrda', informixdir '/opt/IBM/informix');

drop user mapping if exists for current_user server cargotel_tcp;

create user mapping for current_user server cargotel_tcp options (username 'nobody',password 'mailonly66');

drop foreign table if exists acct_cache;

create foreign table acct_cache (
    id serial,
    type_id int,
    prefix text,
    data text,
    amount numeric,
    timestamp timestamp without time zone,
    check_number text,
    client_number text,
    check_date date
) server cargotel_tcp options (
    query 'select * from acct_cache',
    database 'dbscs',
    client_locale 'en_US.utf8',
    db_locale 'en_US.819'
);

But, no change--the crash and backtrace are the same.

I don't know about FC5--this is my first experience with informix--but I downloaded the latest CSDK that I could find. I can check with our informix folks to see if there is a newer one for 11.10FC2.

rossj-cargotel commented 9 years ago

One thing I just thought of that might be relevant--this is a remote informix server to which I need to connect over the DRDA protocol. I can telnet to the DRDA port from the EC2 server and get a connection though.

psoo commented 9 years ago

Jeff,

Thanks for the hint, I'm going to investigate that. Stay tuned.

psoo commented 9 years ago

Jeff,

FYI, I'm able to reproduce your issue now. However, I'm under the impression that this must have something to do with the configuration. For example, dbaccess crashes in my test environment as well (SIGSEGV). So there must be something else happening :/ Will dig further...

rossj-cargotel commented 9 years ago

Hi Bernd,

I thought I'd revisit the whole drda protocol issue since I'm connecting to informix over an openvpn tunnel.

I added a new server to my sqlhosts file for the onsoctcp protocol and now I can connect to informix with the informix_fdw! So far I'm unable to retrieve any rows but will continue working on that.

I do get the following message when I do a select from a foreign table:

WARNING: opened informix connection with warnings
DETAIL: informix SQLSTATE 01I01: "Database has transactions "
 id | name | email | fax | user_parent_id
----+------+-------+-----+----------------
(0 rows)

Google is pointing me to the following PDF

  IBM Informix ESQL/C Programmer's Guide - CURSOR IBM ...
  <http://www.cursor-distribution.de/aktuell.11.70.xC8/documentation/ids_esqlc_bookmap.pdf>

but although it mentions the warning, I can't tell whether it is relevant.

The address book table on the informix server has 35 rows, and I can retrieve them with dbaccess on the ec2 server.

Jeff

rossj-cargotel commented 9 years ago

With DEBUG5 turned on, I see that the warning is just telling me that the informix db has transactions.

I found the copy-and-paste error in my foreign table creation script--the informix_fdw is now working completely!

On to testing!

Thank you, Bernd!