credativ / informix_fdw

Foreign Data Wrapper for Informix Databases
Other
28 stars 5 forks source link

Speed? #7

Closed rossj-cargotel closed 9 years ago

rossj-cargotel commented 9 years ago

I think I've worked through all of the casting/blob issues but I'm seeing really slow result return times.

For example, to query a foreign table with 91,873 rows takes over 200,000 ms, doesn't seem to matter what I query for or whether there's an index on the informix side or not.

Here's the informix version of the alt_info table:

Column name          Type                                    Nulls

id                   serial                                  no
load_id              integer                                 yes
field_id             integer                                 yes
value_obs            varchar(40,0)                           yes
field_link           integer                                 yes
value                varchar(255)                            yes
join_at              varchar(16,0)                           yes

The postgres fdw version of this is:

cargotel=# \d+ metro_ft.alt_info
                                                             Foreign table "metro_ft.alt_info"
   Column   |          Type          |                           Modifiers                            | FDW Options | Storage  | Stats target | Description
------------+------------------------+----------------------------------------------------------------+-------------+----------+--------------+-------------
 id         | integer                | not null default nextval('metro_ft.alt_info_id_seq'::regclass) |             | plain    |              |
 load_id    | integer                |                                                                |             | plain    |              |
 field_id   | integer                |                                                                |             | plain    |              |
 value_obs  | character varying(40)  |                                                                |             | extended |              |
 field_link | integer                |                                                                |             | plain    |              |
 value      | character varying(255) |                                                                |             | extended |              |
 join_at    | character varying(16)  |                                                                |             | extended |              |
Server: cargotel_tcp
FDW Options: ("table" 'alt_info', database 'metro', client_locale 'en_US.utf8', db_locale 'en_US.819')

I set log_min_messages to debug5 and ran this:

cargotel=# \timing
Timing is on.
cargotel=# select id from metro_ft.alt_info;
WARNING:  opened informix connection with warnings
DETAIL:  informix SQLSTATE 01I01: "Database has transactions "
Time: 226552.606 ms

The debug5 logs seem to show an iteration over each row:

2015-06-26 19:26:20 UTC [982]: [15-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  informix_fdw: get foreign relation size, cmd 1
2015-06-26 19:26:20 UTC [982]: [16-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param table=alt_info
2015-06-26 19:26:20 UTC [982]: [17-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param database=metro
2015-06-26 19:26:20 UTC [982]: [18-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param client_locale=en_US.utf8
2015-06-26 19:26:20 UTC [982]: [19-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param db_locale=en_US.819
2015-06-26 19:26:20 UTC [982]: [20-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param informixserver=cargonet
2015-06-26 19:26:20 UTC [982]: [21-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param informixdir=/opt/IBM/informix
2015-06-26 19:26:20 UTC [982]: [22-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param username=nobody
2015-06-26 19:26:20 UTC [982]: [23-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param password=mailonly66
2015-06-26 19:26:20 UTC [982]: [24-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  informix connection dsn "metro@cargonet"
2015-06-26 19:26:20 UTC [982]: [25-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  created new cached informix connection "nobodymetrocargonet"
2015-06-26 19:26:20 UTC [982]: [26-1] user=postgres,db=cargotel,app=psql,client=[local] WARNING:  opened informix connection with warnings
2015-06-26 19:26:20 UTC [982]: [27-1] user=postgres,db=cargotel,app=psql,client=[local] DETAIL:  informix SQLSTATE 01I01: "Database has transactions "
2015-06-26 19:26:20 UTC [982]: [28-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  informix database connection using transactions
2015-06-26 19:26:20 UTC [982]: [29-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  informix_fdw: using rowid 1
2015-06-26 19:26:20 UTC [982]: [30-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  predicate for pushdown:
2015-06-26 19:26:20 UTC [982]: [31-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  prepare query "SELECT *, rowid FROM alt_info"
2015-06-26 19:26:20 UTC [982]: [32-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  declare cursor "nobodymetrocargonet_cur2_1"
2015-06-26 19:26:20 UTC [982]: [33-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  informix_fdw: get foreign paths
2015-06-26 19:26:20 UTC [982]: [34-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  informix_fdw: get foreign plan
2015-06-26 19:26:20 UTC [982]: [35-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  informix_fdw: begin scan
2015-06-26 19:26:20 UTC [982]: [36-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param table=alt_info
2015-06-26 19:26:20 UTC [982]: [37-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param database=metro
2015-06-26 19:26:20 UTC [982]: [38-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param client_locale=en_US.utf8
2015-06-26 19:26:20 UTC [982]: [39-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param db_locale=en_US.819
2015-06-26 19:26:20 UTC [982]: [40-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param informixserver=cargonet
2015-06-26 19:26:20 UTC [982]: [41-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param informixdir=/opt/IBM/informix
2015-06-26 19:26:20 UTC [982]: [42-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param username=nobody
2015-06-26 19:26:20 UTC [982]: [43-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param password=mailonly66
2015-06-26 19:26:20 UTC [982]: [44-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  mapped attnum PG/IFX 1 => 0
2015-06-26 19:26:20 UTC [982]: [45-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  mapped attnum PG/IFX 2 => 1
2015-06-26 19:26:20 UTC [982]: [46-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  mapped attnum PG/IFX 3 => 2
2015-06-26 19:26:20 UTC [982]: [47-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  mapped attnum PG/IFX 4 => 3
2015-06-26 19:26:20 UTC [982]: [48-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  mapped attnum PG/IFX 5 => 4
2015-06-26 19:26:20 UTC [982]: [49-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  mapped attnum PG/IFX 6 => 5
2015-06-26 19:26:20 UTC [982]: [50-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  mapped attnum PG/IFX 7 => 6
2015-06-26 19:26:20 UTC [982]: [51-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  populate descriptor area for statement "nobodymetrocargonet_stmt2_1"
2015-06-26 19:26:20 UTC [982]: [52-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  get descriptor column count 8
2015-06-26 19:26:20 UTC [982]: [53-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  open cursor "nobodymetrocargonet_cur2_1"
2015-06-26 19:26:20 UTC [982]: [54-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  informix_fdw: iterate scan
2015-06-26 19:26:20 UTC [982]: [55-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param table=alt_info
2015-06-26 19:26:20 UTC [982]: [56-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param database=metro
2015-06-26 19:26:20 UTC [982]: [57-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param client_locale=en_US.utf8
2015-06-26 19:26:20 UTC [982]: [58-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param db_locale=en_US.819
2015-06-26 19:26:20 UTC [982]: [59-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param informixserver=cargonet
2015-06-26 19:26:20 UTC [982]: [60-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param informixdir=/opt/IBM/informix
2015-06-26 19:26:20 UTC [982]: [61-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param username=nobody
2015-06-26 19:26:20 UTC [982]: [62-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  ifx_fdw set param password=mailonly66
2015-06-26 19:26:20 UTC [982]: [63-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  get column pg/ifx mapped attnum 0/0
2015-06-26 19:26:20 UTC [982]: [64-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  get column pg/ifx mapped attnum 1/1
2015-06-26 19:26:20 UTC [982]: [65-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  get column pg/ifx mapped attnum 2/2
2015-06-26 19:26:20 UTC [982]: [66-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  get column pg/ifx mapped attnum 3/3
2015-06-26 19:26:20 UTC [982]: [67-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  get column pg/ifx mapped attnum 4/4
2015-06-26 19:26:20 UTC [982]: [68-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  get column pg/ifx mapped attnum 5/5
2015-06-26 19:26:20 UTC [982]: [69-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  get column pg/ifx mapped attnum 6/6
2015-06-26 19:26:20 UTC [982]: [70-1] user=postgres,db=cargotel,app=psql,client=[local] DEBUG:  informix_fdw: iterate scan

and so on, repeating.

I'm doing selects only at this point but need to do some fairly complex queries that I will map as views. I haven't tried disable_rowid because from the description it doesn't apply to selects. I also thought it might be related to the table option versus the query option but there doesn't seem to be a substantial difference between the two now that I have the casts/blobs worked out.

Any suggestions on how to speed this up?

Thanks, Bernd!

psoo commented 9 years ago

Hi,

not sure what you're after, do you mean a SELECT with a specific WHERE condition is still slow? If yes, could you please show the EXPLAIN output?

Please note that if you want to query all rows of a large table all the time, it might be better to have something materialized locally (e.g. a MATERIALIZED VIEW or similar). That's not how a foreign table works.

rossj-cargotel commented 9 years ago

Hi,

Yes, the speed of getting the result set is what I mean. Here's an explain analyze from the alt_info table above:

cargotel=# explain analyze verbose select * from metro_ft.alt_info where load_id > 1;
WARNING:  opened informix connection with warnings
DETAIL:  informix SQLSTATE 01I01: "Database has transactions "
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Foreign Scan on metro_ft.alt_info  (cost=2924.00..3553.33 rows=62933 width=680) (actual time=26.142..231429.761 rows=92084 loops=1)
   Output: id, load_id, field_id, value_obs, field_link, value, join_at
   Informix costs: 2924.00
   Informix query: SELECT *, rowid FROM alt_info WHERE   load_id > 1
 Planning time: 78.501 ms
 Execution time: 231483.801 ms
(6 rows)

Time: 231615.592 ms

There is an index on the load_id column on the informix side. I'm setting up iwatch on informix in hopes it can show me more what's happening. For sure that query in informix doesn't take 4 minutes.

I'd looked at and discarded using materialized views because of the cost of updating them but it might be possible to put triggers on the informix side and save changes in another table that could be unioned with the materialized views. Seems a little wonky though.

Thank you, Bernd, for your help and hard work on this FDW!

psoo commented 9 years ago

Okay, the predicate gets pushed down, however, there are still many rows to transmit according to the EXPLAIN. It's pretty likely that the index isn't used on the Informix side.

Currently, the informix FDW fetches a row for each call to ifxIterateForeignScan(), so one row at a time the PostgreSQL Executor performs the foreign scan. So if you have large results sets, network latency and all additional overhead applies to this single FETCH for each row.

I've thought about using fetch arrays in the past, but disregarded them since they seem to have a benefit with large objects only, see

http://www-01.ibm.com/support/knowledgecenter/SSGU8G_11.50.0/com.ibm.esqlc.doc/sii15750078.htm%23sii15750078

for details. It's also possible to do prefetching on our own, but i'm not sure how this works together with DML commands. This would require much more thinking, i believe.

I've tested the performance with a remote Informix server running on a VM in the past, and i have to say that i get much better results:

bernd@localhost:bernd #= EXPLAIN ANALYZE SELECT * FROM osm_roads WHERE id > 1;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Foreign Scan on osm_roads  (cost=2925.00..3481.13 rows=55613 width=644) (actual time=1.273..2683.446 rows=55613 loops=1)
   Informix costs: 2925.00
   Informix query: SELECT *, rowid FROM osm_roads WHERE   id > 1
 Planning time: 1.910 ms
 Execution time: 2689.411 ms

The table is smaller than yours, however, it looks like you have to deal with heavy roundtrips somewhere. So something that would save you transmitting the large datasets all the time would be certainly an option in your case, like the materialized view mentioned earlier. This way, you'd have to spent only during the refresh the time it needs to transfer all your data...

rossj-cargotel commented 9 years ago

We are moving to a server in the same hosting environment as the informix database so hopefully that will eliminate the roundtrip delay.

Closing this for now.