SAP / PyRFC

Asynchronous, non-blocking SAP NW RFC SDK bindings for Python
http://sap.github.io/PyRFC
Apache License 2.0

Could not convert from 8400 codepage to 4103 codepage #293

Closed scape7yu closed 1 year ago

scape7yu commented 1 year ago

pyrfc == 2.7.0
SAP NWRFC SDK == nwrfc750P_6-70002755-win

from pyrfc import Connection

LANGUAGE = 'zh_CN'
CONN = Connection(ashost=ASHOST, sysnr=SYSNR, client=CLIENT, user=USER, passwd=PASSWD, lang=LANGUAGE)
CONN.call(func_name="XXX")  # error here

I am trying to read data from SAP, but the call fails with this error:

  File "src\pyrfc\_pyrfc.pyx", line 520, in pyrfc.pyrfc.Connection.call
  File "src\pyrfc\_pyrfc.pyx", line 352, in pyrfc.pyrfc.Connection._error
pyrfc._exception.ExternalRuntimeError: RFC_CONVERSION_FAILURE (rc=22): key=RFC_CONVERSION_FAILURE, message=Could not convert from 8400 codepage to 4103 codepage, rc = 512 [MSG: class=, type=, number=, v1-4:=;;;]

The SAP system runs in 'zh_CN', so we can't change LANGUAGE from 'zh_CN' to 'en_US'. Is PyRFC incompatible with 'zh_CN'?

bsrdjan commented 1 year ago

It looks like an issue with the backend data content sent over the RFC connection. To locate the data field causing the error, could you please create a data trace with level 4 (full) and upload it here? Just add TRACE='4' to the connection parameters.

Also, please update the SAP NWRFC SDK to the current release, PL10, because older ones are not supported.
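
A minimal sketch of how the trace parameter could be added without touching the other connection settings. The host, client, and credential values are placeholders, the helper name `with_trace` is made up for illustration, and the lowercase spelling `trace` is an assumption that follows the lowercase parameter style used elsewhere in this thread.

```python
def with_trace(params, level="4"):
    """Return a copy of the connection parameters with RFC tracing enabled."""
    traced = dict(params)      # copy, so the original settings stay unchanged
    traced["trace"] = level    # level 4 is the full data trace requested above
    return traced


# Placeholder values; replace them with the real connection settings.
base_params = {
    "ashost": "sap.example.com",
    "sysnr": "00",
    "client": "100",
    "user": "USER",
    "passwd": "PASSWD",
    "lang": "zh_CN",
}

traced_params = with_trace(base_params)

# With PyRFC installed and a reachable system, the traced call would look like:
#   from pyrfc import Connection
#   conn = Connection(**traced_params)
#   conn.call("XXX")
```

Keeping the trace setting in a separate copy makes it easy to switch tracing off again once the failing call has been captured.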

scape7yu commented 1 year ago

rfc21896_22184.trc.txt

Please take a look when you have time. I deleted many rows of HEX data because this data is private.

Ulrich-Schmidt commented 1 year ago

"I deleted many rows of HEX data because this data is private"

Unfortunately, this is exactly the part I need to look at in order to find the problem... :-) Without it, the trace is worthless.

But let me explain what the problem seems to be; then perhaps you can try to find the cause yourself:

Usually the error we see here is caused by the following: in a non-Unicode SAP system (like in this case codepage 8400, Chinese) the CHAR datatype is byte-based, not char-based. ASCII characters take 1 byte, while Chinese characters take 2 bytes. This means that if you write 5 Chinese characters into a CHAR10 field, the field is full and any following characters will be truncated. If a field now contains an odd number of ASCII characters followed by enough Chinese characters to overflow it, the cut falls in the middle of a Chinese character.

Example: writing 5 ASCII chars followed by 3 Chinese chars into a CHAR10 field would truncate the second byte of the last Chinese character. If we denote the bytes for the ASCII chars by XX and the bytes for the Chinese chars by YY, the following would happen:

XX XX XX XX XX YYYY YYYY YYYY
--- gets truncated to --->
XX XX XX XX XX YYYY YYYY YY

The SAP system then sends this data to the NW RFC Library, which tries to convert it to Unicode. However, when the codepage converter encounters the last byte YY, it realizes that this is not an ASCII character (because it is in the range 0x80 - 0xFF, while ASCII chars are in the range 0x00 - 0x7F), but it does not know what character this originally was, because the second half is missing!

At this point, it throws the error message we see above: "Could not convert from 8400 codepage to 4103 codepage, rc = 512"
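
The truncation described above can be reproduced in plain Python, without any SAP system. This is only a sketch under assumptions: Python's "gbk" codec stands in for SAP codepage 8400, the sample text is invented, and the helper names are hypothetical. It does not use the NW RFC Library's converter, so it shows the mechanism rather than the exact library behaviour.

```python
FIELD_BYTES = 10  # a CHAR10 field in a non-Unicode system holds 10 bytes


def store_in_field(text, size=FIELD_BYTES):
    """Encode text and cut it at the field's byte length, like a byte-based CHAR field."""
    return text.encode("gbk")[:size]


def convert_strict(raw):
    """Strict conversion: fails on an incomplete 2-byte character."""
    return raw.decode("gbk")


def convert_with_marker(raw, marker="#"):
    """Lenient conversion: substitute a marker for bytes that cannot be converted."""
    return raw.decode("gbk", errors="replace").replace("\ufffd", marker)


value = "ABCDE" + "供应商"   # 5 ASCII chars + 3 Chinese chars = 11 bytes in GBK
raw = store_in_field(value)  # only 10 bytes fit; the last Chinese char loses its second byte

try:
    convert_strict(raw)
except UnicodeDecodeError as exc:
    print("strict conversion failed:", exc.reason)

print(convert_with_marker(raw))
```

The lenient variant mirrors the idea behind the ON_CCE=2 workaround mentioned later in this comment: the broken character is lost, but the rest of the field survives.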

By investigating the HEX dump in the trace, we would now be able to find out which of the fields in function module Z_RFC_WLXT_GET_VENDOR contains the broken character. Then you could go into the backend and try to fix it (e.g. edit the field in the database to make it shorter, distribute the value over two fields (or two lines of a table), or increase the size of the field in the DDIC, etc.).

However, I see that your FM Z_RFC_WLXT_GET_VENDOR has only one parameter: a table named IT_EX_VENDOR. If this table is bigger than 8 kB, it will be zipped, so you would first have to unzip the data of the HEX dump before you can read it and find the problematic field... For a "non-guru", this might be a bit difficult.

BTW: now that I think about it, the following might help as a workaround: if you add the additional connection parameter ON_CCE=2, the NW RFC Lib will replace the broken character with a # sign. If all characters are important, this is of course not a solution. But perhaps it at least allows you to find the problematic line/field of IT_EX_VENDOR by looking at the data returned by the Python program and searching for the # sign...

Oh, and by the way: the "best" solution to this problem would of course be to upgrade the SAP system to Unicode.... :-)

scape7yu commented 1 year ago

Thanks for your detailed and patient answer.

I found that some dirty data caused this issue, and using on_cce='2' did indeed solve my problem.

I tried something else:

from pyrfc import Connection

# LANGUAGE = 'zh_CN'
LANGUAGE = 'en_US'  # change from zh_CN to en_US

CONN = Connection(ashost=ASHOST, sysnr=SYSNR, client=CLIENT, user=USER, passwd=PASSWD, lang=LANGUAGE)

data = CONN.call(func_name="XXX")
# Re-read the returned bytes as GBK; 'ignore' drops bytes that cannot be decoded
text = str(data).encode("iso-8859-1").decode("gbk", "ignore")

Using lang='en_US' avoided the conversion exception. I then decoded the result as GBK with the 'ignore' option, which skips the dirty data, and finally got the data in Chinese.

As you said, the best way is to upgrade SAP.