oracle / python-cx_Oracle

Python interface to Oracle Database, now superseded by python-oracledb
https://oracle.github.io/python-cx_Oracle

Large Query-Sets (> 10K Rows) Extremely Slow fetchall() and fetchmany() #139

Closed · ar090 closed this 6 years ago

ar090 commented 6 years ago


We are experiencing linear growth in fetch time when returning larger query sets (anything over 10k rows). To eliminate any factors stemming from table structure and/or query complexity, I ended up creating a row generator, which unfortunately suffers the same fate but demonstrates the problem nicely.

The following query takes about 2 minutes to complete for 100k rows when executed in a terminal:

import cx_Oracle
conn = cx_Oracle.connect('admin/{password}@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=127.0.0.1)(PORT=1521))(CONNECT_DATA=(SID=ORCL)))')
cur = conn.cursor()
cur.arraysize = 500

raw = """
    WITH data(r) AS (
    SELECT 1 r FROM dual
    UNION ALL
    SELECT r+1 FROM data WHERE r < 100000)
    SELECT r FROM data
"""
cur.execute(raw).fetchall()

[screenshot: fetchall() timing, roughly 2 minutes]

Ultimately, in troubleshooting this we noticed the same timing behavior with fetchmany(x) and with various arraysize settings, testing with the following:

import time

def result_iter(cursor):
    result_count = 0
    while True:
        start = time.time()
        # fetch the next batch; note this uses the cursor argument, not the outer cur
        results = cursor.fetchmany(numRows=cursor.arraysize)
        if not results:
            break
        result_count += len(results)
        print(result_count)
        print(time.time() - start)  # elapsed time for this batch
        for result in results:
            yield result

results = []
for row in result_iter(cur.execute(raw)):
    results.append(row)

For comparison's sake, here is the same SQL executed in DataGrip using the JDBC drivers:

[screenshot: DataGrip timing for the same query]

Can you think of anything that would be responsible for this performance with my current installation/settings? Thanks a ton in advance!

Answer the following questions:

  1. What is your version of Python? Is it 32-bit or 64-bit? 3.5.3 64-bit

  2. What is your version of cx_Oracle? 6.1

  3. What is your version of the Oracle client (e.g. Instant Client)? How was it installed? Where is it installed? Instant Client ~/lib

  4. What is your version of the Oracle Database? Oracle Standard Edition Two 12.1.0.2.v10

  5. What is your OS and version? MacOS Sierra 10.12.6

  6. What compiler version did you use? Apple LLVM version 9.0.0 (clang-900.0.38)

  7. What environment variables did you set? How exactly did you set them?

     export ORACLE_HOME=/Users/user-name/instantclient_12_2
     export LD_LIBRARY_PATH=/Users/user-name/instantclient_12_2
     export DYLD_LIBRARY_PATH=/Users/user-name/instantclient_12_2
     export OCI_DIR=/Users/user-name/instantclient_12_2

  8. What exact command caused the problem (e.g. what command did you try to install with)? Who were you logged in as?

  9. What error(s) you are seeing? I did not get any errors just incredibly slow execution. RDS Database CPU remains under 3% during execution.

anthony-tuininga commented 6 years ago

I just tried this using Oracle Client and Database 12.2 on 64-bit Linux. The code you supplied took less than one second to execute and fetch! Just to make sure that we are using exactly the same code, this is what I did:

import cx_Oracle
import datetime

conn = cx_Oracle.Connection("cx_Oracle/welcome@localhost/T122")
cursor = conn.cursor()
cursor.arraysize = 500

print("Starting...")
startTime = datetime.datetime.today()

sql = """
    WITH data(r) AS (
    SELECT 1 r FROM dual
    UNION ALL
    SELECT r+1 FROM data WHERE r < 100000)
    SELECT r FROM data"""
results = cursor.execute(sql).fetchall()

endTime = datetime.datetime.today()
print("Ended...in", endTime - startTime)
print("Fetched", len(results), "rows...")

And I got these results:

Starting...
Ended...in 0:00:00.684277
Fetched 100000 rows...

Admittedly this is with a local database but unless the round trip to your database is significant you still shouldn't see anything in the minute range!

I'll see about getting this tested on a Mac next week.

ar090 commented 6 years ago

Thanks a ton for looking into this over the weekend! Your comment definitely gave me some insight. First off, when playing with the arraysize I noticed that it cannot be changed dynamically once the cursor has executed its first query. I think these screenshots illustrate my point better:

[screenshots: arraysize behavior with and without recreating the cursor between executions]

Notice that without dropping the cursor in each iteration, the originally specified arraysize seems to stick. I think this was throwing me off in testing, and I am not sure if this is expected behavior. Since it is a property on the cursor object, I would expect to be able to mutate it between executions.

With regards to testing on a Mac, that would be awesome! As you can see, my times are still not ideal. While the screenshots above would point to slow round trips, our executions from DataGrip use the same DB, just with different drivers. For reference, we are tunneling to an Oracle RDS instance locally. I am still desperately searching for results like you posted, and any additional help is much appreciated. For what it's worth, with my current installation on Ubuntu I am seeing the same results you posted previously:

[screenshot: Ubuntu timings matching the results above]

Please feel free to let me know if I can help in any way.

anthony-tuininga commented 6 years ago

I just tried on a Mac and got these results with arraysize 500, Oracle Database 12.1 (on Linux) and Oracle Client 12.2:

Starting...
Ended...in 0:00:10.778614
Fetched 100000 rows...

With the same configuration but arraysize 5000 I get the following:

Starting...
Ended...in 0:00:03.049976
Fetched 100000 rows...

Those are definitely slower, but the database is on a different machine and the network between the two machines has a ping time of ~20-25 ms, as opposed to < 1 ms on a true LAN, which explains the difference in performance!
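As a back-of-envelope (a rough model of mine that ignores server time and data transfer, so it underestimates the absolute numbers above, but it shows the trend):

# Network wait is roughly (rows / arraysize) round trips x round-trip time.
rows, rtt = 100000, 0.022  # ~22 ms ping, as measured between the two machines
for arraysize in (500, 5000):
    trips = rows / float(arraysize)
    print("arraysize={0}: ~{1:.0f} round trips, ~{2:.1f}s network wait".format(
        arraysize, trips, trips * rtt))

That predicts ~4.4s at arraysize 500 versus ~0.4s at 5000, so raising the arraysize mostly pays off when round trips are expensive.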

I also looked at the arraysize issue you mentioned. The arraysize value is only examined when a new statement is prepared. This is to improve performance since otherwise new fetch areas would have to be allocated for each execution. If you want to change the arraysize, just create a new cursor!
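A minimal sketch of that pattern (assuming an open connection conn and a query string sql):

# arraysize is read when the statement is prepared, so to fetch with a
# different batch size, open a fresh cursor instead of mutating the old one.
cursor = conn.cursor()
cursor.arraysize = 5000  # applies to statements prepared on this cursor
rows = cursor.execute(sql).fetchall()
cursor.close()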

anthony-tuininga commented 6 years ago

I am going to assume this answered your questions. If not, feel free to re-open.

ssfdust commented 6 years ago

I don't know if I should reopen the issue, but I am hitting the same problem as ar090. It has nothing to do with the arraysize: I have tried changing it to 5000, but it is still very slow.

    child_sql = """
        SELECT vehicle_register.vin, max(vehicle_register.vehicle_login_time) AS vehicle_login_time
        FROM vehicle_register
        GROUP BY vehicle_register.vin
    """
    test_sql = """
        SELECT g1.vin, g2.vehicle_login_time, g1.is_online, g1.sim, g1.batcode
        FROM vehicle_register g1
        JOIN ( SELECT vehicle_register.vin, max(vehicle_register.vehicle_login_time) AS vehicle_login_time
          FROM vehicle_register
          GROUP BY vehicle_register.vin) g2 ON g1.vin = g2.vin AND g1.vehicle_login_time = g2.vehicle_login_time
    """

    ora_cur = conn.cursor()
    print(datetime.now())
    #ora_cur.execute(view_sql, params)
    ora_cur.arraysize = 5000
    ora_cur.execute(child_sql)
    print(datetime.now())
    ora_cur.execute(test_sql)
    all_updated_cars = ora_cur.fetchall()
    print(datetime.now())
$ python test.py
2018-04-08 22:57:37.527903
2018-04-08 22:57:39.262950
2018-04-08 22:58:11.856672

However, it takes just 1.8 seconds in PL/SQL Developer to execute the same query. Could you give me some help?

[screenshot: PL/SQL Developer timing]

ssfdust commented 6 years ago

Can you share the column types in vehicle_register?

column_name            data_type      length
VIN                    VARCHAR2           20
VEHICLE_SERIAL_NUMBER  NUMBER             22
BATPAGE_TOTAL_NUMBER   NUMBER             22
VEHICLE_LOGIN_TIME     DATE                7
SIM                    VARCHAR2           20
IS_ONLINE              NUMBER             22
SYSDT                  DATE                7
BATCODE_LENGTH         NUMBER             22
BATCODE                VARCHAR2         1000

Can you confirm you are running Python and the GUI tool on the same machine?

No, but the machine I run Python on is in the same LAN as the one with the GUI tool. The one running Python is Linux; the other is Windows Server. Here is the lag; it's very low:

ping 10.2.1.190
PING 10.2.1.190 (10.2.1.190) 56(84) bytes of data.
64 bytes from 10.2.1.190: icmp_seq=1 ttl=128 time=0.153 ms
64 bytes from 10.2.1.190: icmp_seq=2 ttl=128 time=0.119 ms
64 bytes from 10.2.1.190: icmp_seq=3 ttl=128 time=0.165 ms

Can you triple-check you are connecting to the same database as the same user?

Yes, I confirm.

What version of cx_Oracle, Oracle client libraries, and Oracle database do you have?

cx_Oracle's version is 6.0rc2 and the database is Oracle 10g. I don't know how to get my client library version. @cjbj, thanks for paying attention to my question.
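For reference, cx_Oracle can report the client library version itself; a minimal check looks like this:

import cx_Oracle

print(cx_Oracle.version)          # cx_Oracle release, e.g. "6.0rc2"
print(cx_Oracle.clientversion())  # Oracle Client version tuple, e.g. (12, 2, 0, 1, 0)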

ssfdust commented 6 years ago

@cjbj Um... I don't know how to explain it. I didn't change anything, but everything is OK now...

anthony-tuininga commented 6 years ago

Everything is now ok? That's good, I guess. :-) I'd suggest upgrading cx_Oracle to 6.2.1, though, instead of using a release candidate!

WilliamLHarms commented 5 years ago

I have a question related to this post. I'm using the latest cx_Oracle package with instantclient_18_3 and the arraysize set to 5k. When I run this in a scheduler, I get intermittent performance that does not seem to be related to row size. Sometimes I pull down 500k rows with 2-3 seconds between fetchmany() calls; other times I can't pull down 20k rows in an hour with the same settings.

When I throw extra debug into my python script, I see that the pause is in calling the fetchmany()

My questions:

My DBA has looked at the performance of our Oracle instance; she said the DB appears to be healthy and doesn't have further advice.

anthony-tuininga commented 5 years ago

You can use the debugging available with ODPI-C. That is documented here. Specifically, set the environment variable DPI_DEBUG_LEVEL to the value 4. That will give you the time spent inside the ODPI-C functions, which are a thin wrapper on top of the Oracle Client libraries. If the time is spent there, you'll want to look at the database more closely. AWR may be able to help you out in that regard. If the majority of the time is spent outside of those functions, let me know.
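For example, a minimal way to turn that on from Python (assuming the variable is read when the Oracle Client libraries initialize; exporting it in the shell before launching Python is the safer route):

import os
os.environ["DPI_DEBUG_LEVEL"] = "4"  # must be set before cx_Oracle initializes ODPI-C

import cx_Oracle  # trace output is written to stderr once calls are made
# ... connect and run the slow fetch; redirect stderr to a file to keep the log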

As for fetch array size, the difference between 500 and 5000 will be minimal unless there is significant network latency between the client and the server.

WilliamLHarms commented 5 years ago

Thanks for the details. I turned on the additional logging; however, what I am seeing is that the time between each fetchmany() grows larger and larger as the data is retrieved (see attached).

I've got a log file attached; I'll get with our DBA tomorrow morning to pull the AWR.

If you have any suggestions or insight I would happily take them.

theLogs.txt

anthony-tuininga commented 5 years ago

The log shows that the time is being taken inside the actual fetch itself. So it looks to me like something in the database is causing this. I'd check things like whether an index is being skipped because of query complexity, NVARCHAR2 vs VARCHAR2 mismatches, etc. Hopefully the AWR will help you pinpoint the issue.
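One quick way to compare the plan your Python session gets against what a GUI tool reports, as a sketch (hypothetical names: conn is an open connection, slow_sql holds the query under investigation):

# EXPLAIN PLAN populates PLAN_TABLE; DBMS_XPLAN.DISPLAY() formats it.
cursor = conn.cursor()
cursor.execute("EXPLAIN PLAN FOR " + slow_sql)
for line, in cursor.execute(
        "SELECT plan_table_output FROM TABLE(DBMS_XPLAN.DISPLAY())"):
    print(line)

If the plan differs from the GUI tool's, session settings or bind types are the usual suspects.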

WilliamLHarms commented 5 years ago

Exactly. What's very odd: sometimes it runs very quickly, 10-15 seconds per 5k rows, after a 2-3 minute warmup on the execute.

DBA checked out for the night, will chat with her tomorrow and review the AWR.

Kushagra2000 commented 4 years ago

@anthony-tuininga I also have a very long runtime with cx_Oracle, but in my case it can be attributed to a distant server. The PL/SQL code executes completely in 19 seconds in SQL Developer, while the equivalent Python code takes 1 min 20 s with cx_Oracle on my own device. I pinged the server IP and got an average ping of 600 ms with a max of 750 ms, and I execute around 200 queries in my Python code. Does cx_Oracle execution time vary drastically with the round-trip time to the database server? If I ran the same Python code locally on the server containing the database, how much time do you estimate I could shave off?
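Back-of-envelope with the numbers above (a crude model of my own: each execute pays at least one network round trip; the raw ping likely overstates the steady-state latency, but the order of magnitude is telling):

queries, rtt = 200, 0.6  # 600 ms average ping reported above
print("~{0:.0f}s potential network wait".format(queries * rtt))  # ~120s

That is the same order as the observed slowdown, so most of the extra minute is plausibly round-trip latency, and running the code on the database host should recover most of it.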

cjbj commented 4 years ago

@Kushagra2000 this is a very old, and closed issue. Start by reading Tuning cx_Oracle. If you still have a problem, please open a new issue. Thanks!