The following are the details of zhangwenzhuo's error case.
I hope the details above help to simulate the case.
We have upgraded nodejs-idb-connector to the latest v1.2.4. It no longer crashes, but it now displays only a single '-' character for the whole narrative field. I believe this is due to the fix for issue #94. But can we have the output value match what we see with STRSQL?
The field's value I see via STRSQL is ' DD01DEC2018 & SR ANNOUN'
Hello @Lemonleave,
The hex value `C4C4F0F1C4C5C3F2F0F1F8405040E2D940C1D5D5D6E4D5` is the EBCDIC-encoded string `DD01DEC2018 & SR ANNOUN`.
The issue may be related to the first byte, `0F`. In both EBCDIC and ASCII it is the *shift in* (SI) control character. STRSQL displays it as a space.
I will investigate how idb-connector handles this character.
@dmabupt,
Exactly, the issue is with the `0F` byte. Thanks for the follow-up.
The error details are:

```
SQLSTATE=01547
SQLCODE=191
Mixed data or UTF-8 data not properly formed.
```
idb-connector may handle this specifically.
Hello @Lemonleave,
Would you try setting the environment variable `DB2CCSID` to `0`? It makes idb-connector run under the IBM i job's default CCSID rather than UTF-8 (1208).

```
$ DB2CCSID=0 node basic.js
[{"COL935":"\u000fDD01DEC2018 & SR ANNOUN ","COL836":"\u000fDD01DEC2018 & SR ANNOUN "}]
```
The related code is https://github.com/IBM/nodejs-idb-connector/blob/master/src/db2ia/db2ia.cc#L20-L25
The JS string keeps the `0F` byte and displays it as `\u000f`.
A string containing the *shift in* character is only valid if a *shift out* character (`0x0E`) has previously occurred in the same string. If the data starts with `0F`, the data is invalid and cannot be properly converted. I'm not sure what, if anything, should be done in the database connector. It is behaving correctly when run in 1208. One could fix the data or handle special cases in SQL if this is a common occurrence in the database.
@dmabupt, apologies for the late reply due to the holiday season. This might work if the DB connection job were also the business-logic server job, but in our system we use a DB connection pool that is separate from the server job. The DB connection jobs are submitted automatically by the system under subsystem QWORK. I will see if there is any way to add the environment variable to our DB connection jobs.
@ThePrez, in some cases our users input the value by copying it from a Word document. That is why it contains the shift-in character; the user does not realize whether it contains a shift in/out. But I'm wondering how RPG/SQLRPG/STRSQL handle this case, because they all display an empty space for the single shift-in character. Is it due to the CCSID of my IBM i login session?
@dmabupt I was able to set the environment variable, and it returns the sample value you provided. But setting the CCSID to 0 means the connector won't do the UTF-8 decode when a field contains non-English characters. In our application, many files contain second-language fields, so setting the zero CCSID does not work for us.
If you are taking non-validated input from the user, there is clearly potential for bad data in the database. It's a classic database problem, but completely understandable. This appears to "work" for green-screen interfaces simply because the data is not converted to a different CCSID.
I suspect the best answer here would involve using `REGEXP_REPLACE` to strip out any data before an SI that has no SO present:
https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_74/db2/rbafzscaregexp_replace.htm
Your search string would be something like `^[^\x0E]*\x0F` (meaning any sequence of characters not containing an SO, followed by an SI).
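Until the data is cleaned up in SQL, the same pattern could also be applied to the decoded strings on the Node side. A minimal sketch (app-layer workaround, not part of idb-connector) mirroring the `^[^\x0E]*\x0F` search string:

```javascript
// Strip everything up to and including a shift-in (\u000f) that has no
// preceding shift-out (\u000e), mirroring the REGEXP_REPLACE pattern above.
function stripLeadingShiftIn(s) {
  return s.replace(/^[^\u000e]*\u000f/, '');
}

console.log(stripLeadingShiftIn('\u000fDD01DEC2018 & SR ANNOUN'));
// "DD01DEC2018 & SR ANNOUN"
console.log(stripLeadingShiftIn('clean text')); // unchanged: "clean text"
```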
Curious if @forstie could lend extra insight.
The progression of the dialog helps, nice going. When DDS-created physical files and native DB access methods are used, there is the potential of inputting invalid data into the database. When SQL is used, the data must be valid for the data type of the target column, so this situation cannot occur.
Options include:
1. The invalid data could be made valid (as Jesse has noted).
2. The application(s) could be changed to prevent future/new instances of invalid data.
3. An SQL view could be created over the troublesome physical file. The view could eliminate the column or transform it. Then, the node app would always use the view.
I usually advise fixing problems at their source, but that requires investment, skill, and resolve. If that's not possible in this case, consider option 3, with the awareness that you're trading off performance to gain function.
hth...
@forstie @dmabupt @ThePrez we have tried the solution below:
1. Select using `HEX(column)` to get the hex value of the invalid data.
2. Convert the hex value to a byte array, then invoke a Java method to convert from the specified CCSID 935: `com.ibm.mq.headers.Charsets.convert(byte[] bytes, int ccsid)`. The final string looks almost the same, except the `\u000f` became a "?" symbol.

Can we do the same in the idb-connector driver, i.e. borrow the conversion logic from the Java implementation?
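While waiting on a driver-side answer, the Java converter's behaviour (substituting "?" for the stray control character, as described above) can be approximated in the application layer. A minimal sketch, not connector code:

```javascript
// Replace SO/SI control characters with '?' in decoded strings, similar to
// the substitution produced by the Java Charsets.convert call above.
function replaceShiftChars(s) {
  return s.replace(/[\u000e\u000f]/g, '?');
}

console.log(replaceShiftChars('\u000fDD01DEC2018 & SR ANNOUN'));
// "?DD01DEC2018 & SR ANNOUN"
```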
We have an existing old file that RPG code displays correctly, but using this connector crashes the entire Node.js process. After investigation, we found that if the field contains certain special characters ('\u000f' in our case), the whole application crashes and Node.js stops responding to the browser. Is it the memcpy function causing the overflow and the Node.js crash, because the length detection no longer works for this case?