fetchraw is broken in cx_Oracle 8.1

talebi1 commented 3 years ago

What versions are you using? cx_Oracle 8.1 Oracle 19c platform.platform: Windows-10-10.0.18362-SP0 sys.maxsize > 2**32: False platform.python_version: 3.7.5 cx_Oracle.version: 8.1.0 cx_Oracle.clientversion: (19, 8, 0, 0, 0)
Is it an error or a hang or a crash?

fetchraw API is fetching the wrong number of rows in the cursor

What error(s) or behavior you are seeing? Empty cursor (sometimes fewer rows than what's in the table if arraysize parameter is tuned)

Include a runnable Python script that shows the problem.


import cx_Oracle
conn = cx_Oracle.connect(connect_string )
cursor = conn.cursor()
cursor.arraysize = 2000
cursor.prefetchrows = 2000
cursor.execute("SELECT 1,2,3 from dual")
rows = cursor.fetchraw() # numRows=2000

print(f'rowcount={cursor.rowcount}') for row in cursor: print(row)

print("done.")


**cx_Oracle 5.x output:**

rowcount=1 (1, 2, 3) done.


**cx_Oracle 8.1 output:**

rowcount=1 done.



I understand that fetchraw will be deprecated in 8.2 but we heavily relay on it and its cursor internals in our C implementation and cannot move to fetchmany anytime soon.

I would appreciate if someone could look into this issue.

Thanks in advance.

anthony-tuininga commented 3 years ago

The intent with fetchraw() was to populate the fetch variables and return the number of rows fetched -- not to support iteration! If you intend to use iteration, why bother with fetchraw()? Just perform the iteration! Set your arraysize and it will internally fetch as many rows as are needed the first time you request a row and thereafter will "fetch" from the internal buffers. Are there other ways that you are using fetchraw()? The example provided doesn't appear to be a valid use case -- but if you have other ones that are valid the method could certainly be retained!

anthony-tuininga commented 3 years ago

In my opinion, the bug was in cx_Oracle 5.x that allowed the behavior you appear to depend on. :-)

talebi1 commented 3 years ago

In my opinion, the bug was in cx_Oracle 5.x that allowed the behavior you appear to depend on. :-)

Thank you @anthony-tuininga for you reply. unfortunately that's how it was implemented in our 20+ years legacy code. any chance fetchraw can be amended to recover the old behavior?

anthony-tuininga commented 3 years ago

My point is that in the code you provided you can simply eliminate the call to fetchraw() and the data will be fetched as expected. The value of cursor.rowcount will only increase as each row is fetched, but that's about the only difference. It doesn't make sense to "recover" the old behavior since it was wrong to begin with!

Can you provide a complete sample of the code you are using? I might be able to suggest some minimal changes that will work for you and won't involve re-introducing old bugs!

talebi1 commented 3 years ago

That is just an example to illustrate the issue. we are calling fetchraw from C code to retrieve the number of rows in the cursor and then allocating the appropriate memory before parsing the internal cursor buffers.

This below code used to work in 5.x

cursor = conn.cursor()
cursor.arraysize = 2000
cursor.execute("SELECT 1,2,3 from dual")
number_of_rows = cursor.fetchraw()
print(f'rowcount={cursor.rowcount}')
print(f'number_of_rows={number_of_rows}')

for row in cursor:
    print(row)

print("done.")

cx_Oracle 5.x output

rowcount=1
number_of_rows=1
(1, 2, 3)
done.

cx_Oracle 8.1 output

rowcount=1
number_of_rows=1
done.

If I remove the fetchraw call, I can no longer get the number of rows in the cursor because cursor.rowcount will be reset to 0.

cursor = conn.cursor()
cursor.arraysize = 2000
cursor.execute("SELECT 1,2,3 from dual")
print(f'rowcount={cursor.rowcount}')

for row in cursor:
    print(row)

print("done.")

### output ###
# rowcount=0
# (1, 2, 3)
# done.

It would help if there is a way to get the number of records in the cursor before fetching it, would that be possible?

cjbj commented 3 years ago

@talebi1 I am generally more conservative than @anthony-tuininga and will chat with him about this. However, what is the real issue: it doesn't look like you have upgraded cx_Oracle for some time? Any major software upgrade, particularly skipping some major versions, is bound to uncover some technical debt in a system. If you are going to spend effort upgrading the infrastructure (e.g. cx_Oracle from 5 to 8), then why are you not budgeting for some code refactoring?

talebi1 commented 3 years ago

@cjbj there is an inconsistent behavior after calling fetchraw between 5.x and 8.1. it is no longer possible to loop over rows in the new 8.1 release after calling fetchraw (see my previous example).

talebi1 commented 3 years ago

The intent with fetchraw() was to populate the fetch variables and return the number of rows fetched -- not to support iteration! If you intend to use iteration, why bother with fetchraw()? Just perform the iteration! Set your arraysize and it will internally fetch as many rows as are needed the first time you request a row and thereafter will "fetch" from the internal buffers. Are there other ways that you are using fetchraw()? The example provided doesn't appear to be a valid use case -- but if you have other ones that are valid the method could certainly be retained!

@anthony-tuininga do you know of a way to get the number of fetched rows in the cursor before starting the iteration? we need that information to allocate the right size of memory in our C code (we use cx_Oracle in our C code, but that's another discussion :) )

cjbj commented 3 years ago

@talebi1 to ask my question more directly: why do you think you need to upgrade?

talebi1 commented 3 years ago

@cjbj the upgrade was forced to us by our security policy to not use old and deprecated 3rd party libraries

anthony-tuininga commented 3 years ago

@anthony-tuininga do you know of a way to get the number of fetched rows in the cursor before starting the iteration? we need that information to allocate the right size of memory in our C code (we use cx_Oracle in our C code, but that's another discussion :) )

I won't ask you why you need to use Python from your C code. :-) There is still the question of why you need to allocate space, though -- when cx_Oracle is already doing that! And even if you do need to duplicate cx_Oracle's code, you can simply do what it does -- allocate space corresponding to the array size. Or you can use a fixed array size of whatever value you desire and simply iterate until no more rows are found or your array is full -- then do with it what you desire. I think there are plenty of good ways to modify your code that will do what you need -- at least based on what you have told me so far!

The change you are referring to was made in cx_Oracle 6 -- about 4 years ago. Although I wasn't aware of the issue, now that I am I stand behind what I said earlier: the behavior you are exploiting was a bug in cx_Oracle 5 which was unknowingly fixed in cx_Oracle 6.

Another possible solution for you: create a subclassed Cursor which has the behavior you desire. Something like this (untested):

class MyConnection(cx_Oracle.Connection):

    def cursor(self):
        return MyCursor(self)

class MyCursor(cx_Oracle.Cursor):

    def __init__(self, connection):
        super().__init__(connection)
        self._fetched_rows = []

    def __iter__(self):
        return iter(self._fetched_rows)

    def fetchraw(self, num_rows):
        self._fetched_rows = self.fetchmany(num_rows)
        return len(self._fetched_rows)

Then in your application you simply need to make sure that you use the MyConnection class instead of calling cx_Oracle.connect() directly. I'm not sure of any of the other requirements your application has so this may not work for you. But it should give you an idea of what you can do to avoid having to change too much at once.

talebi1 commented 3 years ago

@anthony-tuininga: That was an unfortunate architectural decision made 15 years ago by a fervent python supporter in the team :) I would change that if I can go back in time (but It won’t be my top priority 😊) We need to pre-allocate memory in our software for our internal structures for performances reasons, we have to know in advance the exact number of rows to allocate because it can be a couple of records to thousands or millions; so using a fixed array size is not doable with the current architecture.

Thank you for your code above, but the only issue with it is that once we get the number of records in the cursor, we need to loop over it (in C because we also relay the cursor memory internals to make few OCI calls, please don’t ask why 😊)

The above code might partially solve the issue if there was a way to scroll back to the initial position of the cursor and loop over it again. I have tried the scroll API, but it did not work because the cursor was not scrollable. Do you think that’s possible?

anthony-tuininga commented 3 years ago

Did you find a solution that suits you? Scrollable cursors might work for you but I would not recommend their use generally as they require overhead on the database.

talebi1 commented 3 years ago

@anthony-tuininga indeed, it was possible to loop over the cursor returned by fetchraw using the cursor.fetchvars object. (we have also messed the arraysize initialization order (overriding the arraysize was working in the old version when done after the prepare statement but no longer the case in 8.x)

Issue resolved now but we will be moving to fetchmany in the near futur. thanks again for you and @cjbj for you great work on cx_Oracle :)

oracle / python-cx_Oracle

fetchraw is broken in cx_Oracle 8.1 #575