TPC-Council / HammerDB

HammerDB Database Load Testing and Benchmarking Tool
http://www.hammerdb.com
GNU General Public License v3.0

Add Db2 libraries to Docker build #404

Open sm-shaw opened 2 years ago

sm-shaw commented 2 years ago

Libraries for all databases apart from Db2 can be automatically included in the Docker build.

Currently, the Db2 libraries must be added manually to the odbc_cli directory. This issue is a placeholder for finding a way to include them automatically in the Docker build, either by finding an official location from which they can be downloaded, as with the other libraries, or by including them in the odbc_cli directory as redistributable files with the correct permissions from IBM.

sm-shaw commented 1 year ago

From consulting this page: https://www.ibm.com/support/pages/db2-odbc-cli-driver-download-and-installation-information a proposed solution is to put a copy of the IBM® Data Server Driver for ODBC and CLI on www.hammerdb.com to be fetched using curl/wget during the build. The SQL Server and Oracle libraries can already be fetched directly from Microsoft and Oracle locations.

The guidance at the link above says: You can include the driver in your database application installation package, and redistribute the driver with your applications. Under certain conditions, you can redistribute the driver with your database applications royalty-free.

This suggests that the file can be redistributed as described. However, these "certain conditions" do not appear to be detailed anywhere.

If we cannot get a resolution to this issue, another alternative is to remove Db2 from the Docker build and/or create a Db2 only image, where the user will need to do a manual driver download only for this image.

Input is requested on whether to move ahead with redistributing a copy as the licence suggests is permissible, or whether anyone knows what the "certain conditions" are and whether this would prevent moving ahead with the proposed solution.

memmertoIBM commented 1 year ago

I have asked the question again to the Db2 Licensing folks. It's truly bizarre that IBM provides a driver package that is not 100% redistributable. In the case of HammerDB, it's like they'd expect us to unpack the driver into HammerDB and only redistribute the subset that qualifies as "royalty-free".

sm-shaw commented 2 months ago

As we are currently updating Docker for v4.11, this question has arisen again, so here is additional information on this topic.

From here https://www.ibm.com/support/pages/download-initial-version-115-clients-and-drivers the available forms of the IBM client driver (that provide the functionality we are interested in) are:

  1. IBM Data Server Driver for ODBC and CLI (CLI Driver)
  2. IBM Data Server Runtime Client
  3. IBM Data Server Client

In particular this guide https://public.dhe.ibm.com/ps/products/db2/info/vr105/pdf/en_US/DB2InstallingClients-db2ite1050.pdf says that the IBM Data Server Driver for ODBC and CLI (CLI Driver) "has a small footprint and is designed to be redistributed by independent software vendors (ISVs). This driver is also designed to be used for application distribution in mass deployment scenarios that are typical of large enterprises."

This file also explains the restrictions that we were missing before:

Restrictions: Under the terms of the redistribution license, only some of the IBM Data Server Driver for ODBC and CLI files can be redistributed. Which files may be redistributed is listed in the file redist.txt. This file can be found in the compressed file that contains the driver, called ibm_data_server_driver_for_odbc_cli.zip on the Windows operating systems and ibm_data_server_driver_for_odbccli.tar.Z on all other platforms.

The file redist.txt referred to does not exist; however, the file odbc_REDIST.txt does, and it lists the following files:

04370923.cnv
08500923.cnv
08501252.cnv 
08600923.cnv
08630923.cnv
09230437.cnv
09230850.cnv
09230860.cnv
09231043.cnv
09231051.cnv
09231114.cnv
09231208.cnv
09231252.cnv
09231275.cnv
09241252.cnv
09370950.cnv
10430923.cnv
10510923.cnv
11140923.cnv
12080923.cnv
12520850.cnv
12520923.cnv
12750923.cnv
1388ucs2.cnv
1399ucs2.cnv
0939ucs2.cnv
0930ucs2.cnv
1390ucs2.cnv
ucs20943.cnv
0954ucs2.cnv
5039ucs2.cnv
0943ucs2.cnv
IBM00850.ucs
IBM00923.ucs
IBM01252.ucs
db2cli.ini.sample  
db2dsdriver.cfg.sample 
db2dsdriver.xsd
DigiCertGlobalRootCA.arm
db2dsdcfgfill  
db2ldcfg
db2lddrg
db2level
db2trc
db2drdat
db2cli
db2diag
db2support
db2admh.mo
db2adm.mo
db2caem.mo
db2supp.mo
db2cklog.mo
db2fodc.mo
db2stt.mo
db2clia1.lst
db2clias.lst
db2clih.mo
db2cli.mo
db2clit.mo
db2clp.mo
db2clp2.mo
db2diag.mo
db2sqlh.mo
db2sql.mo
IBMOSauthclient.so  
IBMOSauthclient.so.1
IBMIAMauth.so
IBMkrb5.so
libdb2clixml4c.so
libdb2clixml4c.so.1
libDB2xml4c.so
libDB2xml4c.so.58
libdb2.so
libdb2.so.1
libdb2o.so
libdb2o.so.1
sqlda.h
sqlcli1.h
sqlsystm.h
sqlca.h
sqlcli.h
sql.h
sqlenv.h
sqlunx.h
sqlstate.h
sqlext.h
sqlucode.h
sqltypes.h
db2cli.bnd
db2clipk.bnd
db2clist.bnd
db2ajgrt.bnd
db2spcdb.bnd
db2cli.lst
ddcsvm.lst
ddcsmvs.lst
ddcsvse.lst
ddcs400.lst
conlic.bin
odbc_notices.txt
odbc_REDIST.txt

A diff of the provided files against the redistributable files shows the following differences:

Only in /opt/odbc_cli/clidriver: db2dump
Only in /opt/odbc_cli/clidriver/lib: icc
Only in /opt/odbc_cli/clidriver/lib: libDB2xml4c.so.58.0
Only in /opt/odbc_cli/clidriver/license: UNIX
Only in /opt/odbc_cli/clidriver: properties
Only in /opt/odbc_cli/clidriver: Readme.txt
Only in /opt/odbc_cli/clidriver/security32/plugin/IBM/client: IBMkrb5.so.1
Only in /opt/odbc_cli/clidriver/security64/plugin/IBM/client: IBMkrb5.so.1

So it appears that we could take the initial list of files, remove the ones above and then be able to provide a version of the Db2 client library for download from hammerdb.com.
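
The comparison above can also be scripted. The following is a minimal sketch, not the method used in this thread: it assumes odbc_REDIST.txt holds one bare file name per line and that matching by base name is acceptable. The function names `load_redist_names` and `split_by_redist` are hypothetical.

```python
import os


def load_redist_names(redist_txt):
    """Return the set of file names listed in an odbc_REDIST.txt-style file.

    Assumption: one file name per line, blank lines and surrounding
    whitespace ignored.
    """
    with open(redist_txt, encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


def split_by_redist(driver_root, redist_names):
    """Walk driver_root and split its files into (listed, unlisted).

    Both results are sorted lists of paths relative to driver_root.
    Matching is by base name only (an assumption), because the list
    carries file names without directories.
    """
    listed, unlisted = [], []
    for dirpath, _dirnames, filenames in os.walk(driver_root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), driver_root)
            (listed if name in redist_names else unlisted).append(rel)
    return sorted(listed), sorted(unlisted)
```

Called on an unpacked clidriver directory, the second list would be the candidate set of files to leave out of a redistributed copy. Empty directories are not reported by this sketch, since it only looks at files.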

The crucial client library file that HammerDB needs is libdb2.so.1 - and this is included in the list of redistributable files with the IBM Data Server Driver for ODBC and CLI (CLI Driver):

~/HammerDB-4.10/lib/db2tcl2.0.1$ ldd libdb2tcl.so.0.0.1
    ...
    libdb2.so.1 => /opt/db2test/redist/odbc_cli/clidriver/lib/libdb2.so.1 (0x00007f60d1064000)

However, trying to load db2tcl from this client in HammerDB gives a missing symbol error from the library included with the IBM Data Server Driver for ODBC and CLI (CLI Driver):

hammerdb>package require db2tcl
...
libdb2tcl.so: undefined symbol: sqlefrce_api
...

And checking the library confirms that this is missing:

$ objdump -TC libdb2.so.1 | grep sqlefrce_api
...
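
As an alternative to objdump, the same kind of check can be done from Python with ctypes. This is a sketch under stated assumptions: the helper name `exports_symbol` is hypothetical, the library and its own dependencies must be loadable on the current machine, and loading a library runs its initialisers.

```python
import ctypes


def exports_symbol(library_path, symbol):
    """Return True if `symbol` can be resolved through the given library.

    ctypes.CDLL raises OSError if the library itself cannot be loaded.
    The lookup goes through the dynamic loader, so a True result means
    the symbol is reachable via this library or one of its dependencies,
    which is slightly broader than what `objdump -T` on one file shows.
    """
    lib = ctypes.CDLL(library_path)
    # Attribute access on a CDLL object performs the symbol lookup and
    # raises AttributeError when it is undefined; hasattr turns that
    # into a boolean.
    return hasattr(lib, symbol)
```

For example, `exports_symbol("/path/to/clidriver/lib/libdb2.so.1", "sqlefrce_api")` would be the scripted equivalent of the grep above, with the path being a placeholder for wherever the driver was unpacked.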

However, we do use this API in the schema delete functionality, so it is needed.

db2tclcmds.c:    sqlefrce(SQL_ALL_USERS, NULL, SQL_ASYNCH, &sqlca);

SQL_API_RC SQL_API_FN                       /* Force Users                   */
  sqlefrce_api (
        sqlint32 NumAgentIds,                /* number of users to force      */
        sqluint32 * pAgentIds,               /* array of agent ids            */
        unsigned short ForceMode,            /* mode of operation             */
        struct sqlca * pSqlca);              /* SQLCA                         */

In contrast, the IBM Data Server Runtime Client and IBM Data Server Client do include this API. In these clients the libdb2.so.1 library ships in the file BASE_CLIENT_11.5.4.0_linuxamd64_x86_64.tar.gz, so it is a different library from the one in the CLI Driver.

linuxamd64/FILES/lib64$ objdump -TC libdb2.so.1 | grep sqlefrce_api
0000000000e5ac60 g    DF .text  0000000000000257  Base        sqlefrce_api

However, it does not look like IBM Data Server Runtime Client and IBM Data Server Client are redistributable. Only the IBM Data Server Driver for ODBC and CLI (CLI Driver) is redistributable but does not contain the functionality we need.

So this remains unresolved, and we still cannot add the Db2 libraries to HammerDB Docker, meaning Db2 remains the only database where the user has to download and install either the IBM Data Server Runtime Client or IBM Data Server Client themselves.

This will remain the case until:

  1. IBM add sqlefrce_api to the IBM Data Server Driver for ODBC and CLI (CLI Driver), or

  2. IBM advise that the libdb2.so.1 file from the IBM Data Server Runtime Client or IBM Data Server Client can be redistributed (along with the IBM Data Server Driver for ODBC and CLI (CLI Driver)).

Any additional authoritative answer from IBM on this topic is welcome, as Db2 is the only database with this restriction and, without one, we cannot make Db2 work 'out-of-the-box'.

sm-shaw commented 2 months ago

It also looks like the following libraries, in addition to libdb2.so.1, are needed by HammerDB, and all of the following are missing from the redistribution list:

libdb2dascmn.so.1 
libdb2g11n.so.1 
libdb2genreg.so.1 
libdb2install.so.1 
libdb2locale.so.1 
libdb2osse_db2.so.1 
libdb2sdbin.so.1 
libdb2trcapi.so.1 
libdb2osse.so.1

sm-shaw commented 2 months ago

We have tested a build made from the two lists above (the redistributable files and the additional libraries), and this works OK for Db2 tests. We are awaiting a response on whether or not such a build can be used for our Docker builds so that Db2 can work out-of-the-box.
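
One possible way to cross-check a list of needed libraries against the redistribution list is to parse `ldd` output and compare names. This is a sketch only and is not necessarily how the list above was produced; `parse_ldd` and `not_in_redist` are hypothetical helper names, and the parsing assumes the usual glibc `name => path (address)` line format.

```python
import re

# Matches lines such as "libdb2.so.1 => /some/path/libdb2.so.1 (0x...)"
# and "libfoo.so => not found".
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(not found|\S+)")


def parse_ldd(output):
    """Map each dependency name to its resolved path, or None if not found."""
    deps = {}
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if not match:
            continue  # e.g. vdso or loader lines without "=>"
        name, target = match.groups()
        if target.startswith("("):
            continue  # "name =>  (0x...)" form with no path
        deps[name] = None if target == "not found" else target
    return deps


def not_in_redist(deps, redist_names):
    """Return the sorted dependency names that the redist list does not contain."""
    return sorted(name for name in deps if name not in redist_names)
```

Feeding it the text printed by `ldd libdb2tcl.so.0.0.1`, together with the names from odbc_REDIST.txt, would give the direct dependencies that fall outside the list. System libraries such as libc will also show up and need to be filtered out by hand.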

memmertoIBM commented 2 months ago

I did some digging into the API / library issue:

So I see two options here:

memmertoIBM commented 2 months ago

On the redistribution front, it seems like repackaging is still the expectation.

Since the PDF you reference is for Db2 10.5, which is out of support, I dug around in our docs and found similar wording for our current version (11.5), which does permit redistribution. While the documentation doesn't explicitly reference odbc_REDIST.txt, it's safe to assume that redistribution must follow the rules in that file in the tarball.

https://www.ibm.com/docs/en/db2/11.5?topic=overviews-data-server-clients

Installation is to simply extract the tarball:

https://www.ibm.com/docs/en/db2/11.5?topic=dsd-installing-data-server-driver-odbc-cli-software-linux-unix-operating-systems

Download links:

Generic: https://www.ibm.com/support/pages/download-fix-packs-version-ibm-data-server-client-packages

V115M9: https://www.ibm.com/support/pages/node/7071441

sm-shaw commented 2 months ago

Many thanks @memmertoIBM for taking the time to look into this for us. That is really appreciated and it looks like we now have a path forward.

So it is clear that we can redistribute the "IBM Data Server Driver for ODBC and CLI (CLI Driver)" for our Docker build if it is made up of only the files in odbc_REDIST.txt. Our plan would be to build this tar.gz and pull it from hammerdb.com when building a Docker image so then Db2 would work out-of-the-box.
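
A minimal sketch of how such a tar.gz could be assembled is shown below. It assumes the allowed file names have already been read from odbc_REDIST.txt into a set and that matching by base name is sufficient; the function name `build_redist_tarball` and the `clidriver` archive prefix are illustrative choices, not an agreed layout.

```python
import os
import tarfile


def build_redist_tarball(driver_root, allowed_names, out_path, arc_prefix="clidriver"):
    """Write a .tar.gz holding only files whose base name is in allowed_names.

    Returns the sorted list of included paths, relative to driver_root.
    Symbolic links are stored as links (tarfile's default), not followed.
    """
    added = []
    with tarfile.open(out_path, "w:gz") as tar:
        for dirpath, _dirnames, filenames in os.walk(driver_root):
            for name in sorted(filenames):
                if name not in allowed_names:
                    continue  # not on the redistribution list
                full = os.path.join(dirpath, name)
                rel = os.path.relpath(full, driver_root)
                tar.add(full, arcname=os.path.join(arc_prefix, rel), recursive=False)
                added.append(rel)
    return sorted(added)
```

Keeping the directory layout of the original driver means the result can be extracted in place of the full driver. The output file should be written outside `driver_root` so it is not picked up by the walk.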

However, for now the sqlefrce_api issue looks like a problem. We found we needed this call before doing a drop, as otherwise the drop would hang, so the drop wouldn't work without it, and we have the delete option in all the databases and all the example scripts.

Therefore, it looks like the best option is to wait for Db2 v12 (which it looks like is due in November?) to check if "IBM Data Server Driver for ODBC and CLI (CLI Driver)" then contains the sqlefrce_api.

If it doesn't, we may then have to do a special build for Docker of the Db2 interface, where the schema delete functionality is replaced by a message to say it is not available with the "IBM Data Server Driver for ODBC and CLI (CLI Driver)" - although as the Docker build pulls the regular HammerDB binary we will need to work out how to distinguish between the separate builds.

memmertoIBM commented 2 months ago

I can temporarily remove the call to sqlefrce_api from db2tcl's Db2_force_off -- making it a no-op -- which should avoid the dependency. Once Db2 is enhanced, db2tcl can re-enable this functionality.

If all connections are dropped prior to dropping the schema, the force is a no-op anyway.

sm-shaw commented 2 months ago

That could be the best solution then. If db2tcl can compile against the files in the odbc_REDIST.txt list, which also includes the header files, then that would make things a lot easier going forward.