FirebirdSQL / jaybird

JDBC driver for Firebird
https://firebirdsql.org/en/jdbc-driver/
GNU Lesser General Public License v2.1
94 stars 23 forks source link

Poor performance on routed networks (with potentionally higher latency) - probably jaybird/firebird related? #694

Closed hubeny closed 2 years ago

hubeny commented 2 years ago

Hi,

Problem description

I created a simple single-purpose datapump application, which reads data from Oracle Express database and writes data to Firebird database. The datapump copies hundreds of thousands of records, there are no blobs in tables just short varchars and integers. We have several VLANs in our corporate network. We currently have database servers in server VLAN and client computers in client VLAN. All traffic between this two VLANs is routed. The datapump task, which I describe here, runs on a client PC and connects to Oracle DB (for reading) and to Firebird DB (write) on two different servers, but both are in the server VLAN. I tried to place databases and datapump application on different servers and positions within VLANs and found that the running time of the application strongly depends on where the application and firebird database is located and whether traffic between application and firebird database passes through the router (packets must be routed). Further, the oracle JDBC driver uses the full size of the TCP IP packet when reading data, but jaybird/firebird driver did not when writing to database (it will be shown later).

The question of this issue is: Can I somehow force Jaybird/Firebird to use the full size of the network packet (1500 MTU) when making lot of inserts to database in order to have smaller number of packets (which will be routed on network with higher latency)?

Detailed problem description

We use the Jaybird library in the latest version 4.0.6, firebird in the latest version 3.0.10. Here are the results. I measure runtime of the datapump in three scenarios:

1) Everything runs locally on same computer (oracle db, firebird db and datapump): Runtime: Firebird related: 51.232418 seconds Oracle related: 7.991233 seconds

3) Oracle remotely, firebird remotely, datapump is on the same VLAN as the oracle and firebird database Runtime: Firebird related: 215.621674 seconds Oracle related: 22.818972 seconds

4) Oracle remotely, firebird remotely, datapump is on the different VLAN than the oracle and firebird database Runtime: Firebird related: 1227,511597 seconds Oracle related: 111,871330 seconds

On both databases are tables with same names and with columns of same type. The code of the application can be simplified to following (not complete code, just for idea):

public void pumpData() {
    Connection firebirdConn = null;
    Connection oracleConn = null;
    try {
        Connection firebirdConn = createFBConn();
        Connection oracleConn = createORConn();
         for (String tableName : tableNames) {
            copyTableName(tableName, oracleConn, firebirdConn);
        } 
        firebirdConn.close();
        oracleConn.close();
    } catch (SQLException e) {
        //here is code for rollback...
    } finally {
         //close potentionally unclosed connections
    }
}

private void copyTableName (String tblName, Connection orConn, Connection fbConn) {

        String selectSql = String.format( "select  * from %s", tblName);
        Statement selectStmt = null;
        ResultSet rs = null;
        PreparedStatement insertStmt = null;
        int i = 0;
        try {
            selectStmt = orConn.createStatement();
            rs = selectStmt.executeQuery(selectSql);
            ResultSetMetaData mtdt = rs.getMetaData();
            Map<String, Tuple> fields = getFields(isConn, mtdt, tblName);
            String insertSql = String.format("insert into %s (%s)\nvalues\n(%s) ", tblName, getFieldsString(fields, mtdt, false), etFieldsString(fields, mtdt, true));
            insertStmt = isConn.prepareStatement(insertSql);

            while (rs.next()) {
                setInsertParams(insertStmt, fields, rs, loadID);
                insertStmt.execute();
//                insertStmt.addBatch();
                i++;
            }
//            insertStmt.executeBatch();
        } finally {
           //here close resultSet and statements.....
        }
}

I found out that the main reason that the write to firebird database is that firebird protocol under creates too much small packets. See listing bellow (see the end of the lines of TCP dump):

READ from oracle
root@nb101:~# tcpdump -i enx9cebe8f0b9d6  net 172.25.32.77/32
14:50:41.572689 IP lpp.soma.cz.1521 > nb101.soma.cz.59436: Flags [P.], seq 83233:84553, ack 1038, win 410, options [nop,nop,TS val 2360649196 ecr 260542522], length 1320
14:50:41.592649 IP nb101.soma.cz.59436 > lpp.soma.cz.1521: Flags [P.], seq 1038:1055, ack 84553, win 501, options [nop,nop,TS val 260542544 ecr 2360649196], length 17
14:50:41.594603 IP lpp.soma.cz.1521 > nb101.soma.cz.59436: Flags [P.], seq 84553:85986, ack 1055, win 410, options [nop,nop,TS val 2360649218 ecr 260542544], length 1433
14:50:41.617485 IP nb101.soma.cz.59436 > lpp.soma.cz.1521: Flags [P.], seq 1055:1072, ack 85986, win 501, options [nop,nop,TS val 260542569 ecr 2360649218], length 17
14:50:41.619466 IP lpp.soma.cz.1521 > nb101.soma.cz.59436: Flags [P.], seq 85986:87410, ack 1072, win 410, options [nop,nop,TS val 2360649243 ecr 260542569], length 1424
14:50:41.639607 IP nb101.soma.cz.59436 > lpp.soma.cz.1521: Flags [P.], seq 1072:1089, ack 87410, win 501, options [nop,nop,TS val 260542591 ecr 2360649243], length 17
14:50:41.641603 IP lpp.soma.cz.1521 > nb101.soma.cz.59436: Flags [P.], seq 87410:88519, ack 1089, win 410, options [nop,nop,TS val 2360649265 ecr 260542591], length 1109
14:50:41.661610 IP nb101.soma.cz.59436 > lpp.soma.cz.1521: Flags [P.], seq 1089:1106, ack 88519, win 501, options [nop,nop,TS val 260542613 ecr 2360649265], length 17
14:50:41.663631 IP lpp.soma.cz.1521 > nb101.soma.cz.59436: Flags [P.], seq 88519:89831, ack 1106, win 410, options [nop,nop,TS val 2360649287 ecr 260542613], length 1312
14:50:41.683631 IP nb101.soma.cz.59436 > lpp.soma.cz.1521: Flags [P.], seq 1106:1123, ack 89831, win 501, options [nop,nop,TS val 260542635 ecr 2360649287], length 17
14:50:41.687610 IP lpp.soma.cz.1521 > nb101.soma.cz.59436: Flags [P.], seq 89831:91228, ack 1123, win 410, options [nop,nop,TS val 2360649309 ecr 260542635], length 1397
14:50:41.709237 IP nb101.soma.cz.59436 > lpp.soma.cz.1521: Flags [P.], seq 1123:1140, ack 91228, win 501, options [nop,nop,TS val 260542661 ecr 2360649309], length 17
14:50:41.711213 IP lpp.soma.cz.1521 > nb101.soma.cz.59436: Flags [P.], seq 91228:92675, ack 1140, win 410, options [nop,nop,TS val 2360649335 ecr 260542661], length 1447
14:50:41.732151 IP nb101.soma.cz.59436 > lpp.soma.cz.1521: Flags [P.], seq 1140:1157, ack 92675, win 501, options [nop,nop,TS val 260542684 ecr 2360649335], length 17
14:50:41.737463 IP lpp.soma.cz.1521 > nb101.soma.cz.59436: Flags [.], seq 92675:94123, ack 1157, win 410, options [nop,nop,TS val 2360649358 ecr 260542684], length 1448

WRITE to firebird
root@nb101:~# tcpdump -i enx9cebe8f0b9d6  net 172.25.32.42/32
4:51:53.720705 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 78965:79057, ack 28000, win 501, options [nop,nop,TS val 1420455804 ecr 3705278798], length 92
14:51:53.722445 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28000:28032, ack 79057, win 3123, options [nop,nop,TS val 3705278800 ecr 1420455804], length 32
14:51:53.722524 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 79057:79141, ack 28032, win 501, options [nop,nop,TS val 1420455806 ecr 3705278800], length 84
14:51:53.724239 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28032:28064, ack 79141, win 3123, options [nop,nop,TS val 3705278802 ecr 1420455806], length 32
14:51:53.724343 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 79141:79233, ack 28064, win 501, options [nop,nop,TS val 1420455808 ecr 3705278802], length 92
14:51:53.726074 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28064:28096, ack 79233, win 3123, options [nop,nop,TS val 3705278804 ecr 1420455808], length 32
14:51:53.726158 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 79233:79321, ack 28096, win 501, options [nop,nop,TS val 1420455810 ecr 3705278804], length 88
14:51:53.727851 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28096:28128, ack 79321, win 3123, options [nop,nop,TS val 3705278805 ecr 1420455810], length 32
14:51:53.729830 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 79321:79413, ack 28128, win 501, options [nop,nop,TS val 1420455813 ecr 3705278805], length 92
14:51:53.731675 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28128:28160, ack 79413, win 3123, options [nop,nop,TS val 3705278809 ecr 1420455813], length 32
14:51:53.731732 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 79413:79505, ack 28160, win 501, options [nop,nop,TS val 1420455815 ecr 3705278809], length 92
14:51:53.733644 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28160:28192, ack 79505, win 3123, options [nop,nop,TS val 3705278811 ecr 1420455815], length 32
14:51:53.733697 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 79505:79593, ack 28192, win 501, options [nop,nop,TS val 1420455817 ecr 3705278811], length 88
14:51:53.735549 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28192:28224, ack 79593, win 3123, options [nop,nop,TS val 3705278813 ecr 1420455817], length 32
14:51:53.735602 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 79593:79685, ack 28224, win 501, options [nop,nop,TS val 1420455819 ecr 3705278813], length 92
14:51:53.737391 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28224:28256, ack 79685, win 3123, options [nop,nop,TS val 3705278815 ecr 1420455819], length 32
14:51:53.737453 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 79685:79777, ack 28256, win 501, options [nop,nop,TS val 1420455821 ecr 3705278815], length 92
14:51:53.739234 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28256:28288, ack 79777, win 3123, options [nop,nop,TS val 3705278817 ecr 1420455821], length 32
14:51:53.739290 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 79777:79869, ack 28288, win 501, options [nop,nop,TS val 1420455823 ecr 3705278817], length 92
14:51:53.741136 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28288:28320, ack 79869, win 3123, options [nop,nop,TS val 3705278819 ecr 1420455823], length 32
14:51:53.741199 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 79869:79961, ack 28320, win 501, options [nop,nop,TS val 1420455825 ecr 3705278819], length 92
14:51:53.743089 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28320:28352, ack 79961, win 3123, options [nop,nop,TS val 3705278821 ecr 1420455825], length 32
14:51:53.743182 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 79961:80053, ack 28352, win 501, options [nop,nop,TS val 1420455827 ecr 3705278821], length 92
14:51:53.745064 IP devel.soma.cz.gds-db > nb101.soma.cz.39100: Flags [P.], seq 28352:28384, ack 80053, win 3123, options [nop,nop,TS val 3705278823 ecr 1420455827], length 32
14:51:53.745144 IP nb101.soma.cz.39100 > devel.soma.cz.gds-db: Flags [P.], seq 80053:80145, ack 28384, win 501, options [nop,nop,TS val 1420455829 ecr 3705278823], length 92

Further I experiment with PURE_JAVA and NATIVE type of Jaybird connection.

Jaybird with pure-java driver, output from tcpdump:

12:12:30.573764 IP nb101.soma.cz.48724 > devel.soma.cz.gds-db: Flags [P.], seq 187597:187753, ack 45905, win 501, options [nop,nop,TS val 2850058491 ecr 3782115660], length 156
12:12:30.575509 IP devel.soma.cz.gds-db > nb101.soma.cz.48724: Flags [P.], seq 45905:45937, ack 187753, win 3073, options [nop,nop,TS val 3782115662 ecr 2850058491], length 32
12:12:30.575622 IP nb101.soma.cz.48724 > devel.soma.cz.gds-db: Flags [P.], seq 187753:187909, ack 45937, win 501, options [nop,nop,TS val 2850058492 ecr 3782115662], length 156
12:12:30.577428 IP devel.soma.cz.gds-db > nb101.soma.cz.48724: Flags [P.], seq 45937:45969, ack 187909, win 3073, options [nop,nop,TS val 3782115664 ecr 2850058492], length 32

Jaybird with native driver (tested that it is really used, after complete removal of firebird libraries libfbclient.so from PC, stopped working), firebird version is the last 3.0 - 3.0.10 output from tcpdump:

12:15:12.561536 IP nb101.soma.cz.48726 > devel.soma.cz.gds-db: Flags [P.], seq 84984:85144, ack 16993, win 501, options [nop,nop,TS val 2850220478 ecr 3782277647], length 160
12:15:12.563424 IP devel.soma.cz.gds-db > nb101.soma.cz.48726: Flags [P.], seq 16993:17025, ack 85144, win 3073, options [nop,nop,TS val 3782277649 ecr 2850220478], length 32
12:15:12.565566 IP nb101.soma.cz.48726 > devel.soma.cz.gds-db: Flags [P.], seq 85144:85304, ack 17025, win 501, options [nop,nop,TS val 2850220482 ecr 3782277649], length 160
12:15:12.567377 IP devel.soma.cz.gds-db > nb101.soma.cz.48726: Flags [P.], seq 17025:17057, ack 85304, win 3073, options [nop,nop,TS val 3782277653 ecr 2850220482], length 32

It can be seen that the length of the packet increased slightly, but otherwise nothing.

I also thought of experimenting with jaybird addBatch and executeBatch. I.e. instead of dropping individual inserts into the database in a while loop, I tried to do insertStmt.addBatch () in the while loop and finally insertStmt.executeBatch ()

Jaybird with pure-java driver, output from tcpdump:

12:27:53.839979 IP nb101.soma.cz.48730 > devel.soma.cz.gds-db: Flags [P.], seq 216176:216332, ack 120101, win 501, options [nop,nop,TS val 2850981757 ecr 3783038921], length 156
12:27:53.841803 IP devel.soma.cz.gds-db > nb101.soma.cz.48730: Flags [P.], seq 120101:120133, ack 216332, win 3073, options [nop,nop,TS val 3783038923 ecr 2850981757], length 32
12:27:53.841943 IP nb101.soma.cz.48730 > devel.soma.cz.gds-db: Flags [P.], seq 216332:216356, ack 120133, win 501, options [nop,nop,TS val 2850981759 ecr 3783038923], length 24
12:27:53.843671 IP devel.soma.cz.gds-db > nb101.soma.cz.48730: Flags [P.], seq 120133:120201, ack 216356, win 3073, options [nop,nop,TS val 3783038925 ecr 2850981759], length 68
12:27:53.843825 IP nb101.soma.cz.48730 > devel.soma.cz.gds-db: Flags [P.], seq 216356:216512, ack 120201, win 501, options [nop,nop,TS val 2850981761 ecr 3783038925], length 156
12:27:53.845595 IP devel.soma.cz.gds-db > nb101.soma.cz.48730: Flags [P.], seq 120201:120233, ack 216512, win 3073, options [nop,nop,TS val 3783038927 ecr 2850981761], length 32
12:27:53.845684 IP nb101.soma.cz.48730 > devel.soma.cz.gds-db: Flags [P.], seq 216512:216536, ack 120233, win 501, options [nop,nop,TS val 2850981763 ecr 3783038927], length 24
12:27:53.847415 IP devel.soma.cz.gds-db > nb101.soma.cz.48730: Flags [P.], seq 120233:120301, ack 216536, win 3073, options [nop,nop,TS val 3783038929 ecr 2850981763], length 68

Jaybird with native driver

12:25:28.759256 IP nb101.soma.cz.48728 > devel.soma.cz.gds-db: Flags [P.], seq 275580:275740, ack 149801, win 501, options [nop,nop,TS val 2850836676 ecr 3782893841], length 160
12:25:28.761086 IP devel.soma.cz.gds-db > nb101.soma.cz.48728: Flags [P.], seq 149801:149833, ack 275740, win 3073, options [nop,nop,TS val 3782893843 ecr 2850836676], length 32
12:25:28.761139 IP nb101.soma.cz.48728 > devel.soma.cz.gds-db: Flags [P.], seq 275740:275764, ack 149833, win 501, options [nop,nop,TS val 2850836678 ecr 3782893843], length 24
12:25:28.762937 IP devel.soma.cz.gds-db > nb101.soma.cz.48728: Flags [P.], seq 149833:149901, ack 275764, win 3073, options [nop,nop,TS val 3782893845 ecr 2850836678], length 68
12:25:28.762996 IP nb101.soma.cz.48728 > devel.soma.cz.gds-db: Flags [P.], seq 275764:275924, ack 149901, win 501, options [nop,nop,TS val 2850836680 ecr 3782893845], length 160
12:25:28.766831 IP devel.soma.cz.gds-db > nb101.soma.cz.48728: Flags [P.], seq 149901:149933, ack 275924, win 3073, options [nop,nop,TS val 3782893847 ecr 2850836680], length 32
12:25:28.766959 IP nb101.soma.cz.48728 > devel.soma.cz.gds-db: Flags [P.], seq 275924:275948, ack 149933, win 501, options [nop,nop,TS val 2850836684 ecr 3782893847], length 24
12:25:28.769597 IP devel.soma.cz.gds-db > nb101.soma.cz.48728: Flags [P.], seq 149933:150001, ack 275948, win 3073, options [nop,nop,TS val 3782893851 ecr 2850836684], length 68

Here you can see that compared to the classic insertStmt.execute (), there are two more packets in the packet pattern for both implementations - this also confirms the measured result that the runtime is even worse via addBatch and executeBatch!

So for me, for the given task (tens of thousands of inserts into firebird via Jaybird) there is no difference in whether I use the pure-java implementation or the native library of the latest version (firebird 3).

AlexPeshkoff commented 2 years ago

On 6/30/22 17:45, hubeny wrote:

The question of this issue is: Can I somehow force Jaybird/Firebird to use the full size of the network packet (1500 MTU) when making lot of inserts to database in order to have smaller number of packets (which will be routed on network with higher latency)?

The short answer for Firebird is yes, using IBatch (in FB4).

hubeny commented 2 years ago

OK, thank you. Firebird v4 has IBatch. Are there plans to backport this feature to FB3?

Is IBatch feature of Firebird supported in Jaybird? If yes,
1) In which version of Jaybird is IBatch feature of FB4 supported? 2) Is it supported both in PURE_JAVA and NATIVE driver?

AlexPeshkoff commented 2 years ago

On 7/1/22 09:44, hubeny wrote:

OK, thank you. Firebird v4 has IBatch. Are there plans to backport this feature to FB3?

No.

Is IBatch feature of Firebird supported in Jaybird? If yes,

  1. In which version of Jaybird is IBatch feature of FB4 supported?
  2. Is it supported both in PURE_JAVA and NATIVE driver?

Should be present - it was designed in a way to support JDBC batches.

vasiliy-yashkov commented 2 years ago

Hi, @hubeny !

If you are interested in using batch operations in jaybird, you can try our Red Soft Jaybird driver with IBatch implementation. IBatch is supported in NATIVE driver only. You can also use nexus repository to download artifacts:

<repository>
    <id>nexus-red-soft-ru-jaybird</id>
    <url>https://nexus.red-soft.ru/repository/jaybird</url>
</repository>
...
<dependency>
    <groupId>ru.red-soft.jdbc</groupId>
    <artifactId>jaybird-jdk18</artifactId>
    <version>4.0.18</version>
</dependency>

The use of batches is available via protocols jdbc:firebirdsql:fboo:native: or jdbc:firebird:fboo:native:. Also you need to make sure that fbclient library is located on the PATH environment variable (on Windows) or LD_LIBRARY_PATH environment variable (on Linux).

If you have any questions, write to vasiliy.yashkov@red-soft.ru

mrotteveel commented 2 years ago

On 7/1/22 09:44, hubeny wrote: OK, thank you. Firebird v4 has IBatch. Are there plans to backport this feature to FB3?

Is IBatch feature of Firebird supported in Jaybird? If yes, 1. In which version of Jaybird is IBatch feature of FB4 supported? 2. Is it supported both in PURE_JAVA and NATIVE driver?

Should be present - it was designed in a way to support JDBC batches.

No, it is not present, as documented in the Jaybird 4 release notes, in Notes on Firebird 4 support:

Jaybird 4 [..] does not implement the new batch protocol.

The normal Jaybird implementation does not support the batching added in the Firebird protocol. I haven't gotten around to implementing it, and given the protocol and packet content for batching was (and partially still is) undocumented, it hasn't been a priority for me.

mrotteveel commented 2 years ago

This would have been more suitable to ask on the firebird-java Google Group than here, at least as a starting point.

The basic problem is that Jaybird needs to execute each insert individually, and that means that each execute (assuming auto-commit mode is false, and use of a prepared statement) will generate at minimum one request to the server, and one response back (with either "success" or an error if the execute failed) for each execute. This also means it is not possible to "force Jaybird/Firebird to use the full size of the network packet"

If an update count is needed (e.g. executeUpdate or batch execution), it will need 2 requests to the server, and 2 responses. This also explains your observed difference between using execute() and addBatch()/executeBatch(), as this last option requires the update counts (per set of parameters!).

Jaybird emulates JDBC batch execution, so it doesn't result in any performance gain. In fact, as you discovered, compared to execute() it performs even worse, though in auto-commit mode it would perform better compared to individual executes as it avoids transaction commit + transaction start (which requires an extra 2 requests and 2 responses).

Based on the provided information, I don't think there is a possibility to improve performance, other than Jaybird implementing support for Firebird 4+ batch execution. As Vasiliy mentioned, if you're using Firebird 4, you could also switch to RedSoft's fork of Jaybird and use their support for batch execution (native only).

mrotteveel commented 2 years ago

Created #695 for implementing Firebird 4 batch protocol. I'm closing this issue. For further discussion, I suggest posting to firebird-java.