Closed alex-goodisman closed 3 years ago
Looks like you've done a big job of digging into this - that's appreciated.
What's the DB character set? What's the column type?
Are you able to identify and check a row of data that has this problem? Can you dump() it to see if it is corrupted? Can you 'insert as select' that row into a test table for easier analysis?
Without a test case it will be tricky to dig deeper.
NLS_CHARACTERSET is AL32UTF8. The column type is CHAR(90), non-nullable, and char_used is 'B'.
I have a specific row in the database that causes the problem. When I dump()ed the contents of the data I noticed the following strange result: the character 'ä' is represented in the dump output as the single byte 228, despite the database being UTF-8 (which should represent this character as two bytes, 195 164). I don't know how this value got into the database, as every time I attempted to cause it on purpose, it was converted to the two-byte version before it was entered into the CHAR column.
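For reference, the byte values described above can be reproduced in plain Python (an illustration added here, not part of the original report):

```python
# 'ä' is Unicode code point U+00E4 (decimal 228). In AL32UTF8 it must be
# stored as the two bytes 195 164; a lone byte 228 is the Latin-1 encoding.
s = 'ä'
print(ord(s))                     # 228 -- the code point, and the single byte seen in dump()
print(list(s.encode('utf-8')))    # [195, 164] -- the correct UTF-8 byte sequence
print(list(s.encode('latin-1')))  # [228] -- the single byte actually stored in the bad row
```

So a stored byte of 228 is consistent with Latin-1 (or raw code-point) data having been written into a UTF-8 database without conversion.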
Regardless of how it got there, I think this explains the issue. If you have any ideas on how to reproduce this behavior separately, that will probably lead to a reproducible test case.
It occurs to me that this could be happening on the database server - i.e. the database is converting it from 228 to 195 164 before sending it to the client, so the data is now larger than the client expected it to be. How feasible is it to implement an option, similar to the maxSize used for OUT binds, that applies to regular row output instead? There are currently no options in node-oracledb that allow the Node side to control this behavior.
Ah, you have corrupt data. This can be done easily enough by simply passing a set of bytes to the database that aren't actually properly encoded UTF-8 data. The database assumes the client knows what it is doing! For example, if you have this table:
create table issue_1383 (stringcol char(5));
and you run this Python code (which allows this but doesn't recommend it!):
import oracledb

with oracledb.connect("user/password@host/service_name") as conn:
    with conn.cursor() as cursor:
        var = cursor.var(str)
        # deliberately set raw bytes that are not valid UTF-8
        var.setvalue(0, b'\xe4')
        cursor.execute("insert into issue_1383 values (:1)", [var])
    conn.commit()
If you then attempt to retrieve this using SQL*Plus you will get the error ORA-29275: partial multibyte character. If you tried something similar with node-oracledb you would get an error attempting to insert it, but other applications could do what the Python snippet does if they were coded that way. Fetching this data doesn't give NJS-016: buffer is too small for out binds, but the data is obviously corrupted. I suspect the difference has to do with the other data stored alongside the corrupt data. If you can do what I did above with Python and then use the Node.js script below to confirm, that would be helpful. You can also use the above technique to create the data exactly as it is found in the offending row. If you do, please share so we can verify my understanding of the situation!
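As a side note, a lone 0xE4 byte fails to decode as UTF-8 for exactly the reason ORA-29275 describes: it is an incomplete multibyte sequence. A quick Python check (illustrative only):

```python
# A single 0xE4 byte is the start of a 3-byte UTF-8 sequence with no
# continuation bytes, so decoding it fails -- the same condition the
# database reports as ORA-29275: partial multibyte character.
try:
    b'\xe4'.decode('utf-8')
except UnicodeDecodeError as exc:
    print(exc)
```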
Since this is corrupt data, my recommendation would be to correct the data! If it recurs, try to find the application that is causing the corrupt data and have it fixed. :-)
There are limited ways to control output within node-oracledb. You can use fetchInfo to specify the type, as in this program:
const oracledb = require('oracledb');

const dbConfig = {
  user: 'user',
  password: 'password',
  connectString: 'host/service_name'
};

async function main() {
  // fetchInfo keys must match the column name as it appears in the result set
  const fetchInfo = { "STRINGCOL": { type: oracledb.STRING } };
  const options = { fetchInfo: fetchInfo };
  const conn = await oracledb.getConnection(dbConfig);
  const results = await conn.execute("select StringCol from issue_1383", [], options);
  console.log("results:", results);
  await conn.close();
}

main();
So far you can only control the data type, as we have not found a need to control the maximum length -- we trust that the database information is accurate! If the data in the database is corrupt, though, all bets are off. I don't know whether @cjbj will be interested in addressing this situation, since it only occurs when corrupt data is being fetched -- but I'm sure he'll let you know!
Urrrgh - data corruption.
Database version: Oracle Database 18c Standard Edition 2 Release 18.0.0.0.0 - Production
Node version data:
Error
I am seeing a simple query (select COL1 from VIEW where COL2='value') fail with
NJS-016: buffer is too small for OUT binds
I am only seeing the error with certain database rows. For certain values of 'value' above, the query returns the expected result, but for others, it does not, and gives the error instead.
No binds of any kind (in, out, or in-out) are used for the query.
The data that I expect to be returned by the query is 30 characters long, but contains a single two-byte character, for a total of 31 bytes. However, the column COL1 that contains the data has a size of 90 so the data is not too long for its column even when taking byte length into account.
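That character/byte arithmetic can be sanity-checked in Python (the string below is a hypothetical stand-in for the real data, not the actual value):

```python
# Stand-in for the 30-character value: 29 ASCII characters plus one 'ä',
# which takes two bytes in UTF-8.
s = 'a' * 29 + 'ä'
print(len(s))                  # 30 characters
print(len(s.encode('utf-8')))  # 31 bytes -- still well under the CHAR(90) limit
```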
I could not reproduce this behavior using any other SQL Client, including SQL*Plus (running on the command line and entering the same query there) and the Python cx_Oracle library. Both correctly returned the expected output.
I also could not reproduce this behavior by recreating the data in a second database. When I created a new table in a different database with the same columns and data, it behaved as expected and did not give an error. Therefore, I cannot provide a reliable way to reproduce this.
I attempted running the same query with various versions of node-oracledb. I found that it succeeded on version 1.13.1 (on Node 7) and earlier, and failed on all subsequent versions of node-oracledb starting from 2.0.14. As per the changelog, I expect the move to ODPI-C caused this.
I can provide additional metadata about the database but as I am unable to reproduce this behavior on any other database I cannot provide a reliable method to reproduce it.
The database in question is running version
Oracle Database 18c Standard Edition 2 Release 18.0.0.0.0 - Production
The (anonymized) code that I ran to cause the error is this:
The result is shown: