SAP / node-rfc

Asynchronous, non-blocking SAP NW RFC SDK bindings for Node.js
Apache License 2.0

Issue regarding german special characters (ä, ö, ü, ß) #211

Closed mstangl88 closed 3 years ago

mstangl88 commented 3 years ago

Hello Srdjan, we have been using your node-rfc interface for more than 5 years and are very happy with it! In recent years we switched to Node.js and are now using node-rfc 2.4 together with the SAP NW RFC SDK 750 binaries, Patch Level 6 (and most recently Patch Level 8).

node -v v12.19.1

head bin/dev_rfc.log

**** Log file opened at 2021-04-22 15:16:53.195345 UTC+02:00 (GMT), Encoding UTF-8
NW RFC Library: SDK variant, Release 750 Patch Level 6, Compiled on Feb 6 2020 09:04:28
CPIC library: 753.2019.10.14 version 3, NI library: 40, Kernel Release: 753 Patch Level 525
Current working directory: /usr/local/sap/nwrfcsdk/bin, Program: rfcexec
Hardware: AMD/Intel x86_64 with Linux x86_64, Operating_system: Linux 4.18.0-193.28.1.el8_2.x86_64

We have a question regarding German special characters (ä, ö, ü, ß): if we send strings that contain these special characters to SAP, they appear correctly within the string, but an unintended special character (#) appears at the end of the string; see the attached screenshot (TESTstraße 17#). If we send character strings that do not contain any of these characters, the string in the SAP system looks correct.

How can we get rid of that character?

Regards, Martin

[Attached screenshot: HE_specialchr]

bsrdjan commented 3 years ago

Hi Martin,

A bug in node-rfc sounds less likely because the unit tests pass with Unicode characters, including German special characters.

The # character might be related to Unicode settings; here are a couple of links:

https://github.com/SAP/PyRFC/issues/141

https://answers.sap.com/questions/12523482/problems-with-unicode-in-rfc.html

https://help.sap.com/doc/saphelp_nw74/7.4.16/en-us/48/8933fd84b84e6fe10000000a421937/content.htm

Could you please check these settings first?

Kind regards, srdjan

mstangl88 commented 3 years ago

Hi Srdjan,

thank you for your fast reply! Your suggestions are plausible.

As we are not the owners of the SAP system, we have to wait for colleagues to check and possibly change the ABAP settings.
We will let you know about the results.

Best regards, Martin

bsrdjan commented 3 years ago

The unit tests send a Unicode string to an ABAP function module that echoes the same string back. That way conversions in both directions are tested, Node to ABAP and the other way around.

In my test system it works fine, with and without the \x01 byte in the middle of the test string. You can run the same test in your system to see if the Unicode settings make a difference.

I tested with node-rfc 2.4.1 and NWRFC SDK PL8.

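// Assumes node-rfc is installed and a destination "MME" is configured in sapnwrfc.ini.
const addon = require("node-rfc");

const client = new addon.Client({ dest: "MME" });
// Test string with umlauts, CJK and Thai characters, and a \x01 byte in the middle.
const UC = "Hällö 哈洛温 \x01 ßürÖÄ อกครั้งที่";

(async () => {
    try {
        await client.open();

        //console.log(client.connectionInfo);
        // STFC_CONNECTION echoes REQUTEXT back in ECHOTEXT.
        const ECHOTEXT = (
            await client.call("STFC_CONNECTION", {
                REQUTEXT: UC,
            })
        ).ECHOTEXT;

        // Compare the sent and echoed strings and their lengths.
        console.log(UC, UC.length);
        console.log(ECHOTEXT, ECHOTEXT.length);
        console.log(ECHOTEXT === UC);
    } catch (ex) {
        console.log(ex);
    }
})();

Ulrich-Schmidt commented 3 years ago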

As it is only the last character, my gut feeling tells me there is a problem with a "terminating zero" accidentally being sent to ABAP... (In C, strings are zero-terminated, in ABAP they are not. If the client accidentally sends the trailing 0x00 byte into the SAP system, it will be displayed as # there.)

node-rfc uses a UTF-8 conversion routine of the NW RFC library. We recently had a bug in that conversion routine, which caused incorrect zero-termination of the converted strings. Perhaps this is related. The bug was fixed in PL8. So the first thing I would recommend: can you repeat the test with the latest NW RFC library, patch level 8?

If we are lucky, that bugfix already solves the problem. If not, I would like to see an RFC trace of level 3 of the error situation, so I can see
a) the exact data that the node-rfc layer passes down to the NW RFC library
b) the data that the NW RFC library sends over the network to the ABAP side

E.g. set the environment variable RFC_TRACE=3 and execute the node program that shows this symptom once. (The smaller the example, the smaller the trace will be and the easier we will find the error.)

mstangl88 commented 3 years ago

we tested with patch level 8, same behavior

** Log file opened at 2021-04-26 13:07:43.626274 UTC+02:00 (GMT), Encoding UTF-8
NW RFC Library: SDK variant, Release 750 Patch Level 8, Compiled on Apr 16 2021 23:03:51
CPIC library: 753.2020.03.230 version 3, NI library: 40, Kernel Release: 753 Patch Level 814**
Current working directory: /opt/de.axelspringer.digitaleranstrich, Program: node
Hardware: AMD/Intel x86_64 with Linux x86_64, Operating_system: Linux 3.10.0-1127.19.1.el7.x86_64
Hostname: tasit-da01v.linux.asinfra.net, IP address: 10.225.22.176, IPv6 address: fe80::250:56ff:fe85:29b7

bsrdjan commented 3 years ago

Did you try option b), and does my test script work fine?

For option b), could you please attach the trace file and let us know the problematic string?

mstangl88 commented 3 years ago

I don't know where the trace file is written; can you tell me the path/name? I did not try your test script yet.

mstangl88 commented 3 years ago

grep Musterstraße /opt/de.axelspringer.digitaleranstrich/rfc27660.trc
FieldName: STRAS Value: Musterstraße 17
FieldName: STRAS Value: Musterstraße 17

rfc27660.trc.zip

mstangl88 commented 3 years ago

At the breakpoint at the entry of the ABAP function, Musterstraße 17 already has the trailing #, see the screenshot.

[Attached screenshot: MicrosoftTeams-image]

mstangl88 commented 3 years ago

Info about the destination system (from client.connectionInfo). We don't know whether we can change the code page of the destination system, and we don't know how to change it.


{
  dest: '\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000',
  host: 'tasit-da01v.linux.asinfra.net\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000',
  partnerHost: 'k10-ci.sap.asit.services\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000',
  sysNumber: '31',
  sysId: 'K10\u0000\u0000\u0000\u0000\u0000',
  client: '010',
  user: 'HO_DIGI_ANST',
  language: 'D\u0000',
  trace: '0',
  isoLanguage: 'DE',
  **codepage: '4103',**
  partnerCodepage: '4103',
  rfcRole: 'C',
  type: 'E',
  partnerType: '3',
  rel: '753\u0000',
  partnerRel: '750',
  kernelRel: '753',
  cpicConvId: '68532700',
  progName: 'SAPLSYST\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000',
  partnerBytesPerChar: '2',
  partnerSystemCodepage: '4103',
  partnerIP: '10.4.199.80\u0000\u0000\u0000\u0000',
  partnerIPv6: '10.4.199.80\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000'
}

Ulrich-Schmidt commented 3 years ago

Ok, I found the problem: RfcSetString() is called with a length of 16, even though the string "Musterstraße 17" is only 15 characters long. As a result, the 16th character (which is a null-character) is written to the data buffer and causes the #-sign on ABAP side.

>> HEXDUMP
2021-04-29 14:24:57.536407 [139704082200448] RfcSetString(43395a0, "STRAS", pval=41d2710, len=16, errorInfo=7ffdaf130800) buffer:
000000 | 4D007500 73007400 65007200 73007400 |M.u.s.t.e.r.s.t.|
000010 | 72006100 DF006500 20003100 37000000 |r.a.ß.e. .1.7...|
<< HEXDUMP

Srdjan: can it be that node-rfc accidentally uses the "byte-length" of the original UTF-8 input instead of the "char-length"? In UTF-8, the ß-character has 2 bytes, so the byte-length of the UTF-8 string is 16, but after conversion to SAP_UC, it is only 15 characters. (RfcSetString requires the char-length!)
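
The two lengths in question can be checked directly in Node.js. This is only a small sketch to illustrate the point, not node-rfc code; it assumes that the SAP_UC character count corresponds to UTF-16 code units (2 bytes per character, as in the hexdump), which is also what JavaScript's `.length` reports.

```javascript
// Compare the two lengths discussed above for the string from the trace.
const street = "Musterstraße 17";

// Number of bytes when the string is encoded as UTF-8 (ß takes 2 bytes).
const utf8ByteLength = Buffer.byteLength(street, "utf8");

// Number of UTF-16 code units, which is what JavaScript's .length reports.
const utf16UnitLength = street.length;

console.log(utf8ByteLength);  // 16
console.log(utf16UnitLength); // 15
```

For a pure ASCII string both values are identical, which would explain why strings without ä, ö, ü or ß looked correct.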

bsrdjan commented 3 years ago

Thanks Ulrich, that is exactly the case. I confirm the node-rfc bug and got the same trace with the STFC_CONNECTION test script and Musterstraße 17 as input:

>> HEXDUMP
2021-04-29 15:07:02.963008 [4651625984] RfcSetString(7f873dd11c70, "REQUTEXT", pval=7f873dd11ea0, len=16, errorInfo=7ffedfd98a60) buffer:
000000 | 4D007500 73007400 65007200 73007400 |M.u.s.t.e.r.s.t.|
000010 | 72006100 DF006500 20003100 37000000 |r.a.ß.e. .1.7...|
<< HEXDUMP

JavaScript tests did not catch this because strings are zero terminated during C to JS conversion.
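
A rough simulation of why an echo test can miss this, under the assumption that the return conversion cuts the string at the first NUL character. The function `fromZeroTerminated` is a hypothetical stand-in for illustration, not the actual node-rfc conversion code.

```javascript
// Hypothetical stand-in for a C-to-JS string conversion that treats the
// data as zero-terminated: everything from the first NUL onward is dropped.
function fromZeroTerminated(text) {
    const end = text.indexOf("\u0000");
    return end === -1 ? text : text.slice(0, end);
}

const sent = "Musterstraße 17";
// What the ABAP side would hold if one extra NUL character were transmitted.
const storedInAbap = sent + "\u0000";
// What an echo test would see after the return conversion.
const echoed = fromZeroTerminated(storedInAbap);

console.log(storedInAbap.length); // 16
console.log(echoed === sent);     // true
```

In this model the extra character exists only on the ABAP side, so a comparison of the sent and echoed strings in JavaScript still succeeds.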

Affected were "only" RFCTYPE_CHAR and RFCTYPE_STRING. RFCTYPE_NUM, RFCTYPE_BCD, RFCTYPE_DECF16, RFCTYPE_DECF34 and RFCTYPE_FLOAT were fine.

After providing the Unicode character length for CHAR and STRING, the trace looks correct:

>> HEXDUMP
2021-04-29 16:05:16.717046 [4531088896] RfcSetString(7fa15e731680, "REQUTEXT", pval=7fa15e732930, len=15, errorInfo=7ffeeee90058) buffer:
000000 | 4D007500 73007400 65007200 73007400 |M.u.s.t.e.r.s.t.|
000010 | 72006100 DF006500 20003100 37000000 |r.a.ß.e. .1.7...|
<< HEXDUMP
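
The difference between the two traces (same buffer, len=16 versus len=15) can be modelled in a few lines. This is a simplified sketch, assuming the receiver reads exactly `len` two-byte characters from the buffer; `toZeroTerminatedUtf16` and `readChars` are hypothetical helpers, not functions of node-rfc or the NW RFC SDK.

```javascript
// Build a zero-terminated UTF-16LE buffer, similar to the one in the hexdumps.
function toZeroTerminatedUtf16(text) {
    return Buffer.concat([Buffer.from(text, "utf16le"), Buffer.alloc(2)]);
}

// Hypothetical model of a receiver that reads exactly `len` characters
// (2 bytes each) from the buffer, regardless of any terminator.
function readChars(buffer, len) {
    return buffer.subarray(0, len * 2).toString("utf16le");
}

const street = "Musterstraße 17";
const buffer = toZeroTerminatedUtf16(street);

// len=16: the UTF-8 byte length, as in the faulty traces.
const withByteLength = readChars(buffer, Buffer.byteLength(street, "utf8"));
// len=15: the character length, as in the corrected trace.
const withCharLength = readChars(buffer, street.length);

console.log(JSON.stringify(withByteLength)); // "Musterstraße 17\u0000"
console.log(JSON.stringify(withCharLength)); // "Musterstraße 17"
```

With len=16 the terminating NUL becomes part of the value, which is the character shown as # on the ABAP side.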

I just pushed the fix and will publish the release tomorrow at the latest.

bsrdjan commented 3 years ago

Please try the 2.4.2 release with the fix included and re-open in case of any further issues.