SAP / PyRFC

Asynchronous, non-blocking SAP NW RFC SDK bindings for Python
http://sap.github.io/PyRFC
Apache License 2.0

Character Decoding Errors #84

Closed bschwab003 closed 5 years ago

bschwab003 commented 6 years ago

This is a follow-up to a previously posted comment (https://github.com/SAP/PyRFC/issues/65). The scenario is the same, but we now have a bit more information. We receive the following error when calling the RFC_READ_TABLE2 function module to pull data out of SAP using PyRFC 1.9.93:

wrapString uclen: 512 utf8_size: 1537

We were able to dig further and identify the table (BSEG), field (SGTXT), and row causing the problem, and we looked up that record in the SAP GUI. Judging by how the GUI renders some of the characters in that field, it cannot display them and replaces them with "square boxes" and/or the pound/hash symbol. This leads me to believe that the SGTXT field contains characters that cannot be decoded as UTF-8.

I tried my hardest to upload a screenshot but nothing I did seemed to work. I can provide the screenshot via another means if possible.

My guess is that PyRFC is throwing an error because it cannot decode the character(s) found in the field. Is it possible to "ignore" decoding errors when they arise, either skipping the byte(s) that cannot be decoded or replacing them with dummy characters? Perhaps this could be an option set on pyrfc.Connection.call()?
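To illustrate the behavior I am asking about, here is a minimal sketch using Python's built-in `errors` argument of `bytes.decode`. It is not PyRFC API, and the `raw` sample bytes are made up for the example; it only shows what "ignore" and "replace" would mean for a value containing bytes that are not valid UTF-8.

```python
# Hypothetical field content: 0xFF and 0xFE can never appear in valid UTF-8.
raw = b"Invoice \xff\xfe text"

# Strict decoding (the default) fails on the first invalid byte.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "replace" substitutes U+FFFD (the replacement character) for each bad byte.
replaced = raw.decode("utf-8", errors="replace")

# "ignore" silently drops the bad bytes.
ignored = raw.decode("utf-8", errors="ignore")

print(strict_failed)
print(repr(replaced))
print(repr(ignored))
```

With "replace" the position of the bad data stays visible in the result, which is usually easier to audit afterwards than "ignore".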

I know it's completely unrelated, but cx_Oracle, for example, implemented a similar option in its most recent release to deal with failed decodings due to bad or unusual characters, giving the user the option to "ignore" the error. Its prior versions had the same problem of failing on bad/non-Unicode characters.

Thanks,

bsrdjan commented 6 years ago

Please consider that using RFC_READ_TABLE can lead to issues with Unicode, and that using this function module, with or without Python, is not recommended: https://stackoverflow.com/questions/51159513/sap-rfc-read-table-not-returning-all-digits

Adding a configuration flag to PyRFC is technically possible, but each such flag adds overhead and requires design considerations, such as whether to enable it at the connection/call level or at the parameter/structure/field level. Doing this only to overcome the RFC_READ_TABLE limitations would not be the right way to address this issue, which does not happen with other ABAP function modules.

To avoid such issues, I would recommend using standard ABAP function modules and BAPIs.

bschwab003 commented 6 years ago

I am very familiar with the limitations of the RFC_READ_TABLE function module, including the one in this Stack Overflow post.

It is important to note (though I am not sure whether it matters) that we are using BODS/RFC_READ_TABLE2, which is completely different from RFC_READ_TABLE. This function module addresses those limitations, including the one you referenced in the Stack Overflow post.

I will admit I am not an expert in C programming, but it looks like the error we are getting comes from the following source code in /src/pyrfc/_pyrfc.pyx.

Could a fix be made here to handle the error differently?

cdef wrapString(SAP_UC* uc, uclen=-1, rstrip=False):
    cdef RFC_RC rc
    cdef RFC_ERROR_INFO errorInfo
    if uclen == -1:
        uclen = strlenU(uc)
    if uclen == 0:
        return ''
    cdef unsigned utf8_size = uclen * 3 + 1
    cdef char *utf8 = <char*> malloc(utf8_size)
    utf8[0] = '\0'
    cdef unsigned result_len = 0
    rc = RfcSAPUCToUTF8(uc, uclen, <RFC_BYTE*> utf8, &utf8_size, &result_len, &errorInfo)
    if rc != RFC_OK:
        # raise wrapError(&errorInfo)
        raise RFCError('wrapString uclen: %u utf8_size: %u' % (uclen, utf8_size))
    try:
        if rstrip:
            return utf8.rstrip().decode('UTF-8')
        else:
            return utf8.decode('UTF-8')
    finally:
        free(utf8)

bsrdjan commented 6 years ago

Unfortunately, I don't have RFC_READ_TABLE2 installed in our test systems and could not find SAP notes describing what it does and what its restrictions are. Could you perhaps send me the relevant OSS note(s), similar to 2503119 - Information about function module RFC_READ_TABLE or 382318 - FAQ|Function module RFC_READ_TABLE?

I suppose the error is caused by non-Unicode characters or bytestrings being sent as RFC_READ_TABLE2 strings.

You could simply comment out the line that raises the exception, or try increasing the size of the conversion buffer (utf8_size), as shown below, and then build PyRFC from source.

    # >> try 4 or more here, like uclen * 4 + 1 and so on
    cdef unsigned utf8_size = uclen * 3 + 1

    cdef char *utf8 = <char*> malloc(utf8_size)
    utf8[0] = '\0'
    cdef unsigned result_len = 0
    rc = RfcSAPUCToUTF8(uc, uclen, <RFC_BYTE*> utf8, &utf8_size, &result_len, &errorInfo)
    if rc != RFC_OK:
        # raise wrapError(&errorInfo)
        # raise RFCError('wrapString uclen: %u utf8_size: %u' % (uclen, utf8_size))
        # >> extend the exception message
        raise RFCError('wrapString uclen: %u utf8_size: %u result_len: %u' % (uclen, utf8_size, result_len))
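
As a rough sanity check of the buffer arithmetic, the sketch below assumes that SAP_UC holds UTF-16 code units (an assumption about the Unicode build of the SDK, not something confirmed in this thread). It measures how many UTF-8 bytes a single UTF-16 code unit can expand to for a few hand-picked sample characters, and reproduces the sizes from the reported message.

```python
def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode ch in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


def utf8_bytes(ch: str) -> int:
    """Number of bytes needed to encode ch in UTF-8."""
    return len(ch.encode("utf-8"))


# Samples covering 1-, 2-, 3- and 4-byte UTF-8 sequences.
samples = ["A", "\u00e9", "\u20ac", "\uffff", "\U0001F600"]

# UTF-8 bytes produced per UTF-16 code unit consumed.
ratios = {ch: utf8_bytes(ch) / utf16_units(ch) for ch in samples}
worst = max(ratios.values())

# Sizes from the reported error: uclen 512 gives utf8_size 1537.
uclen = 512
utf8_size = uclen * 3 + 1

print(worst, utf8_size)
```

Under that assumption, well-formed input never needs more than 3 bytes per code unit, because a 4-byte UTF-8 sequence always consumes two UTF-16 units. If a larger multiplier does not make the error go away, that would point toward the field content itself rather than the buffer size.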

bschwab003 commented 6 years ago

This link should allow you to install the function module without Business Objects Data Services.

https://launchpad.support.sap.com/#/notes/1916294

bsrdjan commented 5 years ago

Did you try adding more information to the exception, for example:

raise RFCError('wrapString uclen: %u utf8_size: %u result_len: %u' % (uclen, utf8_size, result_len))

bsrdjan commented 5 years ago

Were the answers above helpful? Are there any other open questions, or can we close the issue?