oracle / python-cx_Oracle

Python interface to Oracle Database now superseded by python-oracledb
https://oracle.github.io/python-cx_Oracle
Other
888 stars 361 forks source link

How to increase logging level? #310

Closed lrhazi closed 5 years ago

lrhazi commented 5 years ago

Is there a way to increase the verbosity level of this module so as to get debugging or even tracing logs?

Frequently my connections fail with this output:

ORA-03135: connection lost contact
Process ID: 2996
Session ID: 13 Serial number: 61015

And wonder if I could get more details, like did it connect first properly, did it send sql statement and was waiting on response?

Thanks a lot.

lrhazi commented 5 years ago

I found out my issue occurs not during "connect", but during "execute". What's odd is that when I start my script, with python -i, it fails, and then I simply recall the same funtion that failed, with the same sql statement, and most of the time, it works fine! Still, additional debugging log would be maybe useful. Thanks a lot.

anthony-tuininga commented 5 years ago

You can use ODPI-C debugging to get some additional information. In particular logging levels 4, 8 and 16 would be of most use with cx_Oracle. The error and your description of the problem sounds like it might be a network stability issue, though.

lrhazi commented 5 years ago

Thanks a lot Anthony. My question has been answered, but am gonna try to get more :)

am getting this output, any idea where to look next? I am afraid db server side? would a packet capture help me much?

ODPI [00423] 2019-05-21 14:50:34.847: fn start dpiContext_create
ODPI [00423] 2019-05-21 14:50:34.874: fn end dpiContext_create -> 0
ODPI [00423] 2019-05-21 14:50:34.874: fn start dpiContext_getClientVersion(0x2759190)
ODPI [00423] 2019-05-21 14:50:34.875: fn end dpiContext_getClientVersion(0x2759190) -> 0
ODPI [00423] 2019-05-21 14:50:34.875: fn start dpiContext_initCommonCreateParams(0x2759190)
ODPI [00423] 2019-05-21 14:50:34.875: fn end dpiContext_initCommonCreateParams(0x2759190) -> 0
ODPI [00423] 2019-05-21 14:50:34.875: fn start dpiContext_initConnCreateParams(0x2759190)
ODPI [00423] 2019-05-21 14:50:34.875: fn end dpiContext_initConnCreateParams(0x2759190) -> 0
ODPI [00423] 2019-05-21 14:50:34.875: fn start dpiConn_create(0x2759190)
ODPI [00423] 2019-05-21 14:50:35.041: fn end dpiConn_create(0x2759190) -> 0
ODPI [00423] 2019-05-21 14:50:35.041: fn start dpiConn_getEncodingInfo(0x2711fa0)
ODPI [00423] 2019-05-21 14:50:35.041: fn end dpiConn_getEncodingInfo(0x2711fa0) -> 0
DB: Got connected.
DB: Got cursor.
ODPI [00423] 2019-05-21 14:50:35.042: fn start dpiConn_prepareStmt(0x2711fa0)
ODPI [00423] 2019-05-21 14:50:35.042: SQL SELECT DISTINCT C.EMAIL_ADDR

FROM PS_EMAIL_ADDRESSES C,
     PS_JOB @OPSUSER_CS_HR D

WHERE C.PREF_EMAIL_FLAG = 'Y'
  AND C.EMAIL_ADDR <> ' '
  AND C.EMPLID = D.EMPLID
  AND D.EFFDT =
        (SELECT MAX(D_ED.EFFDT) FROM PS_JOB @OPSUSER_CS_HR D_ED
          WHERE D.EMPLID = D_ED.EMPLID
            AND D.EMPL_RCD = D_ED.EMPL_RCD
            AND D_ED.EFFDT <= SYSDATE)
  AND D.EFFSEQ =
        (SELECT MAX(D_ES.EFFSEQ) FROM PS_JOB @OPSUSER_CS_HR D_ES
          WHERE D.EMPLID = D_ES.EMPLID
            AND D.EMPL_RCD = D_ES.EMPL_RCD
            AND D.EFFDT = D_ES.EFFDT)
  AND D.EMPL_STATUS IN ('A','L','P','S')
  AND D.PAYGROUP = 'FRF'

ORDER BY C.EMAIL_ADDR
-- FETCH FIRST 10 ROWS ONLY
ODPI [00423] 2019-05-21 14:50:35.042: fn end dpiConn_prepareStmt(0x2711fa0) -> 0
ODPI [00423] 2019-05-21 14:50:35.042: fn start dpiStmt_getInfo(0x24e1250)
ODPI [00423] 2019-05-21 14:50:35.042: fn end dpiStmt_getInfo(0x24e1250) -> 0
ODPI [00423] 2019-05-21 14:50:35.042: fn start dpiStmt_setFetchArraySize(0x24e1250)
ODPI [00423] 2019-05-21 14:50:35.042: fn end dpiStmt_setFetchArraySize(0x24e1250) -> 0
ODPI [00423] 2019-05-21 14:50:35.042: fn start dpiStmt_execute(0x24e1250)
########## <long time goes by>
ODPI [00423] 2019-05-21 14:55:36.376: OCI error ORA-03135: connection lost contact
Process ID: 8580
Session ID: 4 Serial number: 61666 (dpiStmt_execute / execute)
ODPI [00423] 2019-05-21 14:55:36.376: fn end dpiStmt_execute(0x24e1250) -> -1
t=group-mgmt-test m=oracle p=ERROR ORA-03135: connection lost contact
Process ID: 8580
Session ID: 4 Serial number: 61666
ODPI [00423] 2019-05-21 14:55:36.378: fn start dpiStmt_release(0x24e1250)
ODPI [00423] 2019-05-21 14:55:36.378: fn end dpiStmt_release(0x24e1250) -> 0
ODPI [00423] 2019-05-21 14:55:36.378: fn start dpiConn_release(0x2711fa0)
ODPI [00423] 2019-05-21 14:55:36.379: fn end dpiConn_release(0x2711fa0) -> 0
lrhazi commented 5 years ago

Whats really odd is that, like I said, if I stayed in the same Python interpreter instance and right after the error, call back my code, it just works:

ORDER BY C.EMAIL_ADDR
-- FETCH FIRST 10 ROWS ONLY

ODPI [00423] 2019-05-21 14:57:55.188: fn end dpiConn_prepareStmt(0x2859710) -> 0
ODPI [00423] 2019-05-21 14:57:55.188: fn start dpiStmt_getInfo(0x25104e0)
ODPI [00423] 2019-05-21 14:57:55.188: fn end dpiStmt_getInfo(0x25104e0) -> 0
ODPI [00423] 2019-05-21 14:57:55.188: fn start dpiStmt_setFetchArraySize(0x25104e0)
ODPI [00423] 2019-05-21 14:57:55.188: fn end dpiStmt_setFetchArraySize(0x25104e0) -> 0
ODPI [00423] 2019-05-21 14:57:55.188: fn start dpiStmt_execute(0x25104e0)

ODPI [00423] 2019-05-21 14:58:06.079: fn end dpiStmt_execute(0x25104e0) -> 0
ODPI [00423] 2019-05-21 14:58:06.079: fn start dpiStmt_getRowCount(0x25104e0)
ODPI [00423] 2019-05-21 14:58:06.079: fn end dpiStmt_getRowCount(0x25104e0) -> 0
ODPI [00423] 2019-05-21 14:58:06.079: fn start dpiStmt_getQueryInfo(0x25104e0)
ODPI [00423] 2019-05-21 14:58:06.079: fn end dpiStmt_getQueryInfo(0x25104e0) -> 0
ODPI [00423] 2019-05-21 14:58:06.079: fn start dpiConn_newVar(0x2859710)
ODPI [00423] 2019-05-21 14:58:06.079: fn end dpiConn_newVar(0x2859710) -> 0
ODPI [00423] 2019-05-21 14:58:06.079: fn start dpiVar_getSizeInBytes(0x2805c50)
ODPI [00423] 2019-05-21 14:58:06.079: fn end dpiVar_getSizeInBytes(0x2805c50) -> 0
ODPI [00423] 2019-05-21 14:58:06.079: fn start dpiStmt_define(0x25104e0)
ODPI [00423] 2019-05-21 14:58:06.080: fn end dpiStmt_define(0x25104e0) -> 0
ODPI [00423] 2019-05-21 14:58:06.080: fn start dpiStmt_fetchRows(0x25104e0)
ODPI [00423] 2019-05-21 14:58:06.082: fn end dpiStmt_fetchRows(0x25104e0) -> 0
ODPI [00423] 2019-05-21 14:58:06.082: fn start dpiStmt_fetchRows(0x25104e0)
ODPI [00423] 2019-05-21 14:58:06.083: fn end dpiStmt_fetchRows(0x25104e0) -> 0
ODPI [00423] 2019-05-21 14:58:06.083: fn start dpiStmt_fetchRows(0x25104e0)
ODPI [00423] 2019-05-21 14:58:06.084: fn end dpiStmt_fetchRows(0x25104e0) -> 0
ODPI [00423] 2019-05-21 14:58:06.084: fn start dpiStmt_fetchRows(0x25104e0)
ODPI [00423] 2019-05-21 14:58:06.085: fn end dpiStmt_fetchRows(0x25104e0) -> 0
DB: Got results: 399
ODPI [00423] 2019-05-21 14:58:06.086: fn start dpiVar_release(0x2805c50)
ODPI [00423] 2019-05-21 14:58:06.086: fn end dpiVar_release(0x2805c50) -> 0
ODPI [00423] 2019-05-21 14:58:06.086: fn start dpiStmt_close(0x25104e0)
ODPI [00423] 2019-05-21 14:58:06.086: fn end dpiStmt_close(0x25104e0) -> 0
ODPI [00423] 2019-05-21 14:58:06.086: fn start dpiStmt_release(0x25104e0)
ODPI [00423] 2019-05-21 14:58:06.086: fn end dpiStmt_release(0x25104e0) -> 0
ODPI [00423] 2019-05-21 14:58:06.086: fn start dpiConn_release(0x2859710)
ODPI [00423] 2019-05-21 14:58:06.094: fn end dpiConn_release(0x2859710) -> 0
anthony-tuininga commented 5 years ago

If you look at the time difference between the start of the call to dpiStmt_execute() and resulting failure, you'll note it is almost exactly 5 minutes (300 seconds). That suggests network configuration -- but you can run it several times to see if you consistently get the 300 seconds.

anthony-tuininga commented 5 years ago

And if the network is flaky in any way, working and not working in quick succession is quite normal. I am not a network guru, though! :-)

lrhazi commented 5 years ago

Really weird though:

I run my script like so: python3 -i -- main.py It fails, and get the interpreter prompt then I proceed to call my run_sql() function and it works, and it works repeatedly! exit python, and run it again, and it fails again, and so on, as if something is unique to that first run... except my funtion is like this:

def run_sql(sql):
    results = None
    db_string = get_db_string()
    try:
        db_conn = cx_Oracle.connect(db_string)
        cur = db_conn.cursor()
        cur.execute(sql)
        results = [x[0] for x in cur.fetchall()]
        cur.close()
        db_conn.close()
    except DatabaseError as e:
        logger.error(e)
    return results

Which means am starting over each time, right? why would restarting python bring the problem back?

anthony-tuininga commented 5 years ago

That is weird! What is the sql you are executing? Can you try something as simple as select 1 from dual?

lrhazi commented 5 years ago

yes, that simple sql seems to work fine! and my actual sql is in the debug log I pasted earleir. It's not too complicated and only returns couple hundered rows, when it works.

ODPI [01168] 2019-05-21 15:38:46.371: fn end dpiVar_getSizeInBytes(0x1ee4980) -> 0
ODPI [01168] 2019-05-21 15:38:46.371: fn start dpiStmt_define(0x1eca280)
ODPI [01168] 2019-05-21 15:38:46.371: fn end dpiStmt_define(0x1eca280) -> 0
ODPI [01168] 2019-05-21 15:38:46.372: fn start dpiStmt_fetchRows(0x1eca280)
ODPI [01168] 2019-05-21 15:38:46.372: fn end dpiStmt_fetchRows(0x1eca280) -> 0
DB: Got results: 1
ODPI [01168] 2019-05-21 15:38:46.372: fn start dpiVar_release(0x1ee4980)
ODPI [01168] 2019-05-21 15:38:46.372: fn end dpiVar_release(0x1ee4980) -> 0
ODPI [01168] 2019-05-21 15:38:46.372: fn start dpiStmt_close(0x1eca280)
ODPI [01168] 2019-05-21 15:38:46.372: fn end dpiStmt_close(0x1eca280) -> 0
ODPI [01168] 2019-05-21 15:38:46.372: fn start dpiStmt_release(0x1eca280)
anthony-tuininga commented 5 years ago

Hmm, I'm not sure what to say. @cjbj may be able to help further when he is awake...but in the meantime I'd suggest exploring the boundaries of the failure -- seeing whether the amount of time the execute takes is related to the failure in any way -- or the amount of data. It looks like your SQL contains references to other databases via database link? Perhaps try fetching a single row from a view that references a database link?

anthony-tuininga commented 5 years ago

That database link might explain things, though -- why it fails the first time (establishing the connection via the database link takes some time?) and works fine the rest of the times?

lrhazi commented 5 years ago

Great! thanks a lot. I will dig further.

lrhazi commented 5 years ago

Hmmm, I think my issue was a little extra space!!!

    PS_JOB @OPSUSER_CS_HR D

Thank you so much for the tips!