ibmdb / node-ibm_db

IBM DB2 and IBM Informix bindings for node
MIT License
188 stars 151 forks

PDF content error after inserting a PDF BLOB into a DB2 table #972

Closed cdlixiang closed 4 months ago

cdlixiang commented 6 months ago

Hi team,

My application has had the problem below since we upgraded ibm_db from 2.8.1 to 3.2.3. The app has a feature that lets users upload a PDF file from a web page for business purposes. The backend then inserts the PDF as a BLOB into a business table in the DB2 database, and I and other authorized users can open the PDF in our app to continue the business process.

Because of a vulnerability warning, we upgraded ibm_db from 2.8.1 to 3.2.3 without changing anything else. After the upgrade, the same PDF uploaded through the app has a different size when downloaded (with a SQL SELECT statement) than before, and it cannot be displayed. Many parts of the downloaded file are encoded differently from the previous PDF.

Through debugging we found that the size of the PDF before the INSERT statement is executed is the same under both versions. A PDF uploaded with 2.8.1 downloads correctly with either the 2.8.1 app or the 3.2.3 app, while a PDF uploaded with 3.2.3 is corrupted with either. So we think that something in the insert path of ibm_db 3.2.3 differs from before and changes the encoding of the PDF. Could you help us confirm and solve this problem?

bimalkjha commented 6 months ago

@cdlixiang This change in behavior may be due to a commit in ibm_db@3.0.0 that expects binary data to be in a buffer. I suggest updating the database connection info in node_modules/ibm_db/test/config.json and then running the commands below from a terminal to test:

cd ..../node_modules/ibm_db/test
node test-blob-file.js
node test-blob-insert.js

You can edit test-blob-file.js and change /data/phool.jpg at line 23 to /data/yourPDFFile.pdf; copy your PDF into the test/data directory and then run the test program. If it works, compare your application code with the code in this test file and update it accordingly. Let me know the results. Thanks.

cdlixiang commented 5 months ago

Hi @bimalkjha, thank you for your reply and suggestion. I ran the tests, and the results are listed below. (The test program could not drop and create the table, so that was done another way.)

test % node test-blob-file.js
drop skip
create skip
img1.length = 121125
text.length = 5
doc.length = 12205
Lengths after select = 121125, 5, 12205
done

test % node test-blob-insert.js
drop skip
create skip
img1.length = 121125
text.length = 5
buffer data = IIȌs�<�9�5
skip drop table
Lengths after select = 121125, 5
buffer after select : IIȌs�<�9�5

Test null value insert in BLOB coloum.

[ { ID: 'Aaa_123 ', DERP: null } ]

bimalkjha commented 5 months ago

@cdlixiang Did you use yourPDFFile.pdf in test-blob-insert.js as suggested in my last update, and did you verify the selected PDF file that the test program produces at the end, after the insert? You can comment out the code in the test program that deletes the retrieved file. Do you see any difference between the inserted PDF and the selected PDF? The test output shows that the inserted and selected files have the same size. Since you reported data corruption, please compare the files and check for it. If the inserted and selected files from the test program are the same, update your application to work the same way as the test program and verify. Thanks.
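The file comparison suggested above can be done with a few lines of Node.js. The following is only a minimal sketch using the standard `fs` and `crypto` modules; `compareFiles` is a hypothetical helper name and is not part of ibm_db or its test suite. It reports sizes, SHA-256 digests, and the first byte offset at which the two files differ.

```javascript
const fs = require('node:fs');
const crypto = require('node:crypto');

// Hypothetical helper (not part of ibm_db): compare two files byte for byte.
function compareFiles(pathA, pathB) {
  const a = fs.readFileSync(pathA);
  const b = fs.readFileSync(pathB);
  const sha256 = (buf) => crypto.createHash('sha256').update(buf).digest('hex');

  // Find the first offset where the contents differ.
  // -1 means no difference was found.
  const shared = Math.min(a.length, b.length);
  let firstDiff = -1;
  for (let i = 0; i < shared; i++) {
    if (a[i] !== b[i]) {
      firstDiff = i;
      break;
    }
  }
  // Same prefix but different lengths: the difference starts where the shorter file ends.
  if (firstDiff === -1 && a.length !== b.length) firstDiff = shared;

  return {
    identical: a.equals(b),
    sameSize: a.length === b.length,
    sizeA: a.length,
    sizeB: b.length,
    sha256A: sha256(a),
    sha256B: sha256(b),
    firstDiff,
  };
}

// Example call (both paths are placeholders):
// console.log(compareFiles('data/inserted.pdf', 'data/selected.pdf'));
```

Equal sizes alone do not prove the files match, so the `identical` flag or the digests are the values to check.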

cdlixiang commented 5 months ago

@bimalkjha When I tested with test-blob-insert.js last week, the PDFs before the insert and after the select were the same. I then compared my app's code with test-blob-insert.js and found no big difference in how prepare() and execute() are used to insert the PDF into the table. So I looked at the difference in the input. In the test I had set inputfile1 = 'data/BREPDF.JP.0177102.M1M0005020.20220106.C.E.PDF..P..SVC.PDF' and passed my PDF in as img, which inserts and selects correctly. Then I put my PDF into buf instead (because in our app the PDF is converted to a Buffer and then inserted into the table), and the error below occurred. Could you please give me more suggestions for finding a solution?

node test-blob-insert.js
drop skip
create skip
img1.length = 121125
text.length = 5
buffer data = 121125
skip drop table
Lengths after select = 121125, 5
buffer after select : 16
node:assert:124
  throw new AssertionError(obj);
  ^

AssertionError [ERR_ASSERTION]: Expected values to be strictly deep-equal:

Node.js v18.12.1

cdlixiang commented 5 months ago

Hi @bimalkjha. I changed that part of my app to match test-blob-insert.js and tested it.

Before: buf = Buffer.from(bin);
After: buf = bin;

I inserted the PDF into the table and selected it back as before, and it is now correct and displays normally. From comments I found on the internet, ibm_db previously used a Buffer object to represent BLOB data, so binary data had to be converted with Buffer.from before inserting. Since 3.0.0, the function that processes BLOB data has changed so that ArrayBuffer and binary data can be processed, and a Buffer object is no longer needed. Could you please confirm this? Otherwise it will be difficult for me to judge the impact of changing my app's code.

bimalkjha commented 5 months ago

@cdlixiang Yes, the above is correct. Since 3.0.0, the function that processes BLOB data has changed: ArrayBuffer and binary data can be processed, so a Buffer object is not needed. Thanks.
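The thread never states what type `bin` is in the application, so one way to judge the impact of dropping the `Buffer.from` call is to log the value's kind and byte length just before the insert. This is a diagnostic sketch only, built on standard Node.js APIs; `describeBinary` is a hypothetical helper name and not an ibm_db function. It does not establish the cause of the corruption reported in this issue.

```javascript
// Hypothetical diagnostic helper (not part of ibm_db): report what kind of
// value is about to be bound as a BLOB and how many bytes it holds.
function describeBinary(value) {
  if (Buffer.isBuffer(value)) {
    return { kind: 'Buffer', byteLength: value.length };
  }
  if (value instanceof ArrayBuffer) {
    return { kind: 'ArrayBuffer', byteLength: value.byteLength };
  }
  if (ArrayBuffer.isView(value)) {
    // Typed arrays and DataView, e.g. Uint8Array.
    return { kind: value.constructor.name, byteLength: value.byteLength };
  }
  if (typeof value === 'string') {
    // Buffer.from(string) encodes as UTF-8 by default, so a string holding
    // one character per byte grows when it contains characters above 0x7F.
    return {
      kind: 'string',
      utf8Bytes: Buffer.byteLength(value, 'utf8'),
      latin1Bytes: Buffer.byteLength(value, 'latin1'),
    };
  }
  return { kind: typeof value, byteLength: null };
}

// Example, where `bin` stands for whatever the upload handler produces:
// console.log('bin:', describeBinary(bin));
// console.log('Buffer.from(bin):', describeBinary(Buffer.from(bin)));
```

If the reported byte length matches the size of the original PDF both with and without `Buffer.from`, the conversion itself is not changing the data and the difference lies elsewhere.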