wolfgangbrandl / db2_fdw

PostgreSQL DB2 Data Wrapper
Other
21 stars 13 forks source link

Foreign Data Wrapper for DB2

db2_fdw is a PostgreSQL extension that provides a Foreign Data Wrapper for easy and efficient access to DB2 databases, including pushdown of WHERE conditions and required columns as well as comprehensive EXPLAIN support.

This README contains the following sections:

  1. Cookbook
  2. Objects created by the extension
  3. Options
  4. Usage
  5. Installation Requirements
  6. Installation
  7. Internals
  8. Problems
  9. Support

db2_fdw was written by Wolfgang Brandl, with notable contributions from Laurenz Alba from Austria.

1 Cookbook

This is a simple example how to use db2_fdw. More detailed information will be provided in the sections Options and Usage. You should also read the

PostgreSQL documentation on foreign data

and the commands referenced there. A free distribution of DB2 can be found at:

IBM Db2 Express-C: Available at no charge

For the Installation of DB2 look at:

An overview of installing DB2 database servers

For the sake of this example, let's assume you can connect as operating system user postgres (or whoever starts the PostgreSQL server) with the following command:

db2 connect to SAMPLE

That means that the DB2 client and the environment is set up correctly. We also assume that the SAMPLE database provided in the DB2 package installation was built with:

db2sample

Please look at:

DB2 Verify Installation using command line processor

I also assume that db2_fdw has been compiled and installed (see the Installation section).

We want to access the tables defined in the SAMPLE database:

db2 describe table DB2INST1.EMPLOYEE

                                Data type                     Column
Column name                     schema    Data type name      Length     Scale Nulls
------------------------------- --------- ------------------- ---------- ----- ------
EMPNO                           SYSIBM    CHARACTER                    6     0 No
FIRSTNME                        SYSIBM    VARCHAR                     12     0 No
MIDINIT                         SYSIBM    CHARACTER                    1     0 Yes
LASTNAME                        SYSIBM    VARCHAR                     15     0 No
WORKDEPT                        SYSIBM    CHARACTER                    3     0 Yes
PHONENO                         SYSIBM    CHARACTER                    4     0 Yes
HIREDATE                        SYSIBM    DATE                         4     0 Yes
JOB                             SYSIBM    CHARACTER                    8     0 Yes
EDLEVEL                         SYSIBM    SMALLINT                     2     0 No
SEX                             SYSIBM    CHARACTER                    1     0 Yes
BIRTHDATE                       SYSIBM    DATE                         4     0 Yes
SALARY                          SYSIBM    DECIMAL                      9     2 Yes
BONUS                           SYSIBM    DECIMAL                      9     2 Yes
COMM                            SYSIBM    DECIMAL                      9     2 Yes

Then configure db2_fdw as PostgreSQL superuser like this:

pgdb=# CREATE EXTENSION db2_fdw;
pgdb=# CREATE SERVER sample FOREIGN DATA WRAPPER db2_fdw OPTIONS (dbserver 'SAMPLE');
pgdb=# GRANT USAGE ON FOREIGN SERVER sample TO pguser;

(You can use other naming methods or local connections, see the description of the option dbserver below.)

Then you can connect to PostgreSQL as pguser and define:

pgdb=> CREATE USER MAPPING FOR PUBLIC SERVER sample OPTIONS (user '', password '');

pgdb=> IMPORT FOREIGN SCHEMA "DB2INST1" FROM SERVER sample INTO public;

(Remember that table and schema name -- the latter is optional -- must normally be in uppercase.)

Now you can use the table like a regular PostgreSQL table.

2 Objects created by the extension

FUNCTION db2_fdw_handler() RETURNS fdw_handler
FUNCTION db2_fdw_validator(text[], oid) RETURNS void

These functions are the handler and the validator function necessary to create a foreign data wrapper.

FOREIGN DATA WRAPPER db2_fdw HANDLER db2_fdw_handler VALIDATOR db2_fdw_validator

The extension automatically creates a foreign data wrapper named db2_fdw. Normally that's all you need, and you can proceed to define foreign servers. You can create additional DB2 foreign data wrappers, for example if you need to set the nls_lang option (you can alter the existing db2_fdw wrapper, but all modifications will be lost after a dump/restore).

FUNCTION db2_close_connections() RETURNS void

This function can be used to close all open DB2 connections in this session. See the Usage section for further description.

FUNCTION db2_diag(name DEFAULT NULL) RETURNS text

This function is useful for diagnostic purposes only. It will return the versions of db2_fdw, PostgreSQL server and DB2 client. If called with no argument or NULL, it will additionally return the values of some environment variables used for establishing DB2 connections. If called with the name of a foreign server, it will additionally return the DB2 server version.

3 Options

Foreign data wrapper options

(Caution: If you modify the default foreign data wrapper db2_fdw, any changes will be lost upon dump/restore. Create a new foreign data wrapper if you want the options to be persistent. The SQL script shipped with the software contains a CREATE FOREIGN DATA WRAPPER statement you can use.)

Foreign server options

User mapping options

Foreign table options

Column options (from PostgreSQL 9.2 on)

4 Usage

DB2 permissions

The DB2 user will obviously need CONNECT privilege and the right to select from the table or view in question.

Connections

db2_fdw caches DB2 connections because it is expensive to create an DB2 session for each individual query. All connections are automatically closed when the PostgreSQL session ends.

The function DB2_close_connections() can be used to close all cached DB2 connections. This can be useful for long-running sessions that don't access foreign tables all the time and want to avoid blocking the resources needed by an open DB2 connection. You cannot call this function inside a transaction that modifies DB2 data.

Columns

When you define a foreign table, the columns of the DB2 table are mapped to the PostgreSQL columns in the order of their definition.

db2_fdw will only include those columns in the DB2 query that are actually needed by the PostgreSQL query.

The PostgreSQL table can have more or less columns than the DB2 table. If it has more columns, and these columns are used, you will receive a warning and NULL values will be returned.

If you want to UPDATE or DELETE, make sure that the key option is set on all columns that belong to the table's primary key. Failure to do so will result in errors.

Data types

You must define the PostgreSQL columns with data types that db2_fdw can translate (see the conversion table below). This restriction is only enforced if the column actually gets used, so you can define "dummy" columns for untranslatable data types as long as you don't access them (this trick only works with SELECT, not when modifying foreign data). If an DB2 value exceeds the size of the PostgreSQL column (e.g., the length of a varchar column or the maximal integer value), you will receive a runtime error.

These conversions are automatically handled by db2_fdw:

DB2 type                 | Possible PostgreSQL types
-------------------------+--------------------------------------------------
CHAR                     | char
VARCHAR                  | character varying
CLOB                     | text
VARGRAPHIC               | text
GRAPHIC                  | text
BLOB                     | bytea
SMALLINT                 | smallint
INTEGER                  | integer
BIGINT                   | bigint
DOUBLE                   | numeric,float
DATE                     | date
TIMESTAMP                | timestamp
TIME                     | time

This part is still under development. Restrictions will arise in further testing.

WHERE conditions and ORDER BY clauses

Joins between foreign tables

Modifying foreign data

EXPLAIN

For the explain the db2expln CLI command is called. Therefore the bin path of DB2_HOME has to be include into the PATH environment variable.

Support for IMPORT FOREIGN SCHEMA

From PostgreSQL 10.1 on, IMPORT FOREIGN SCHEMA is supported to bulk import table definitions for all tables in an DB2 schema. In addition to the documentation of IMPORT FOREIGN SCHEMA, consider the following:

Note that IMPORT FOREIGN SCHEMA does not work with DB2 server 8i; see the Problems section for details.

5 Installation Requirements

db2_fdw should compile and run on any platform supported by PostgreSQL and DB2 client, although I could only test it on Linux and Windows.

PostgreSQL 10.1 or better is required. Support for INSERT, UPDATE and DELETE is available from PostgreSQL 9.3 on.

DB2 client version 11.1 or better is required. db2_fdw can be built and used with DB2 Instant Client as well as with DB2 Client and Server installations installed with Universal Installer. Binaries compiled with DB2 Client 10 can be used with later client versions without recompilation or relink.

The supported DB2 server versions depend on the used client version (see the DB2 Client/Server Interoperability Matrix in support document 207303.1). For maximum coverage use DB2 Client 11.1, as this will allow you to connect to every server version from 8.1.7 to 12.1.0 except 9.0.1. PostgreSQL and DB2 need to have the same architecture, for example you cannot have 32-bit software for the one and 64-bit software for the other.

It is advisable to use the latest Patch Set on both DB2 client and server, particularly with desupported DB2 versions. For a list of DB2 bugs that are known to affect db2_fdw's usability, see the Problems section. Consult the db2_fdw Wiki (https://github.com/laurenz/db2_fdw/wiki) for tips about DB2 installation and configuration and share your own knowledge there.

DB2 Configuration

So that the DB2 Data Wraper can connect ot DB2 the necessary DB2 catalogs have to be created. DB2 needs at least a database catalog. If the postgres instance User is also the db2 instance than you have a local DB2 database. Execute:

db2 list database directory

If you get a database than try:

db2 connect to < database name>

If that works you can continue with the Installation and configuration.

If not, where is you DB2 database ? Remote or locally under an other user then you hostname is "localhost". If it is remote try if it is possible the hostname of the remote instance can be resolved by DNS like :

host <hostname>

If it cannot be resolved use the ip address as remote server name.

Find out on which port the DB2 Server is listening with:

db2 get dbm cfg |grep SVCENAME

If this is a number betwenn 1025 and 64000 then use this number if it is a name checkout the number in /etc/services for this name.

Then you can configure the node:

db2 catalog tcpip node <any nodename you want> remote localhost server <port>

After that you configure the database on the give nodename like:

db2 catalog database <db name> as <alias db name> at node <nodename you have defined before>

6 Installation

If you use a binary distribution of db2_fdw, skip to "Installing the extension" below.

Building db2_fdw:

db2_fdw has been written as a PostgreSQL extension and uses the Extension Building Infrastructure PGXS. It should be easy to install.

You will need PostgreSQL headers and PGXS installed (if your PostgreSQL was installed with packages, install the development package). You need to install DB2's C header files as well (SDK package for Instant Client). If you use the Instant Client ZIP files provided by DB2 and you are not on Windows, you will have to create a symbolic link from libclntsh.so to the actual shared library file yourself.

Make sure that PostgreSQL is configured --without-ldap (at least the server). See the Problems section.

Make sure that pg_config is in the PATH (test with pg_config --pgxs). Set the environment variable DB2_HOME to the location of the DB2 installation.

Unpack the source code of db2_fdw and change into the directory. Then the software installation should be as simple as:

$ make
$ make install

For the second step you need write permission on the directories where PostgreSQL is installed.

If you want to build db2_fdw in a source tree of PostgreSQL, use

$ make NO_PGXS=1

Installing the extension:

Make sure that the db2_fdw shared library is installed in the PostgreSQL library directory and that db2_fdw.control and the SQL files are in the PostgreSQL extension directory.

Since the DB2 client shared library is probably not in the standard library path, you have to make sure that the PostgreSQL server will be able to find it. How this is done varies from operating system to operating system; on Linux you can set LD_LIBRARY_PATH or use /etc/ld.so.conf.

Make sure that all necessary DB2 environment variables are set in the environment of the PostgreSQL server process (DB2_HOME if you don't use Instant Client, TNS_ADMIN if you have configuration files, etc.)

To install the extension in a database, connect as superuser and

CREATE EXTENSION db2_fdw;

That will define the required functions and create a foreign data wrapper.

To upgrade from an db2_fdw version before 1.0.0, use

ALTER EXTENSION db2_fdw UPDATE;

Note that the extension version as shown by the psql command \x or the system catalog pg_available_extensions is not the installed version of db2_fdw. To get the db2_fdw version, use the function DB2_diag.

Environment setup

It is mandatory that you correctly setup environment variables to use the extension.

DB2 uses a lot of environment variables, usually created by the

db2profile

script.

If you run PostgreSQL form a shell (via pg_ctl), ensure that the shell includes that script.

If you run PostgreSQL as a systemd unit, add the variables to the unit definition file (see #4)

If you use Ubuntu, please put the variables in

/etc/postgresql/XXX/main/environment

Running the regression tests:

Unless you are developing db2_fdw or want to test its functionality on an exotic platform, you don't have to do this.

For the regression tests to work, you must have a PostgreSQL cluster (10.1 or better) and an DB2 server (11.1 or better with Locator or Spatial) running, and the db2_fdw binaries must be installed. The regression tests will create a database called contrib_regression and run a number of tests.

The DB2 database must be prepared as follows:

The regression tests are run as follows:

$ make installcheck

7 Internals

db2_fdw sets the MODULE of the DB2 session to postgres and the ACTION to the backend process number. This can help identifying the DB2 session and allows you to trace it with DBMS_MONITOR.SERV_MOD_ACT_TRACE_ENABLE.

The isolation level is directly defined in the database. Per default the SAMPLE database is create with the isolation level 'currently commited'

To check the isolation level execute: db2 get db cfg for SAMPLE|grep CUR_COMMIT

If this is set to OFF the default is cursor stability.

8 Problems

There is a problem running the fdw in Windows. Up to now this fdw can only run if the system local in Windows is set to English(United States). There are problems with the representation of double,real and float with the '," sign. If the DB2 database is running Code Page 1252 then also the postgres db should be WIN1252. Up to now it is not possible to get the XML data type with the OCI db2 functions. Perhaps the odbc driver is more compatible for this feature.

9 Support

If you want to report a problem with db2_fdw, and the name of the foreign server is (for example) "sample", please include the output of

SELECT DB2_diag('sample');

in your problem report. If that causes an error, please also include the output of

SELECT DB2_diag();

If you have a problem or question or any kind of feedback, the preferred option is to open an issue on GitHub This requires a GitHub account.