freesoftwarefactory / parse-multipart

A javascript/nodejs multipart/form-data parser which operates on raw data.
MIT License

Buffer not convertible to image ? (yes it can) #8

Open rcfrias opened 6 years ago

rcfrias commented 6 years ago

Once the image file arrives in the form of data: <Buffer 41 41 41 41 42 42 42 42 ... >, it cannot be directly converted to an image file. Is there a way to transform this data payload into a readable buffer?

christiansalazar commented 6 years ago

Yes it can. In my projects it is currently handling binary payloads without issues.
Are you familiar with Node.js buffers? When you log a buffer via console.log() it is only stringified to <Buffer ...>; the underlying bytes are intact (see the sketch after the video links).

Take a look at these two videos:

  1. https://www.youtube.com/watch?v=BrYJlR0yRnw
  2. https://www.youtube.com/watch?v=JSq7JMZj3Ig
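
A minimal sketch of that behavior (illustrative only; the file path is arbitrary):

const fs = require('fs');

// A Buffer is raw bytes; console.log() only *renders* it as "<Buffer ...>".
const buf = Buffer.from([0x41, 0x41, 0x41, 0x41, 0x42, 0x42, 0x42, 0x42]);
console.log(buf); // <Buffer 41 41 41 41 42 42 42 42>

// Writing the Buffer directly preserves the bytes exactly.
fs.writeFileSync('/tmp/sample.bin', buf);

// String round trips are only safe for text; for binary data (like a JPEG)
// they are lossy, which is the trap discussed further down in this thread.
console.log(buf.toString('utf-8')); // "AAAABBBB" (fine here because these bytes are ASCII)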
rcfrias commented 6 years ago

Yes, I am currently testing locally with a Node.js app. Here is a log of how I am getting the body server-side:

body:
------WebKitFormBoundaryPxJ9lHXIKOZSM761
Content-Disposition: form-data; name="name"

test product
------WebKitFormBoundaryPxJ9lHXIKOZSM761
Content-Disposition: form-data; name="code"

product code
------WebKitFormBoundaryPxJ9lHXIKOZSM761
Content-Disposition: form-data; name="category"

product category
------WebKitFormBoundaryPxJ9lHXIKOZSM761
Content-Disposition: form-data; name="uploadFile"; filename="productImage.jpg"
Content-Type: image/jpeg

����JFIF��C

��C

        �X�"��

���}!1AQa"q2��#B��R��$3br�  
%&'()*456789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz���������������������������������������������������������������������������
... (rest of the image)

That's what ends up in my body variable; then I do:

var boundary = multipart.getBoundary(contentType);
var bodyBuffer = new Buffer(body,'utf-8');
var parts = multipart.Parse(bodyBuffer,boundary);

for(var i=0;i<parts.length;i++){
    var part = parts[i];
    // console.log(part);
    /*
    this console.log(part) prints:
    { name: 'name', data: 'test product' }
    { name: 'code', data: 'product code' }
    { name: 'category', data: 'product category' }
    { filename: 'productImage.jpg',
      type: 'image/jpeg',
      data: <Buffer ef bf bd ef bf bd ef bf bd ef bf bd 00 10 4a 46 49 46 00 01 01 00 00 01 00 01 00 00 ef bf bd ef bf bd 00 43 00 03 02 02 02 02 02 03 02 02 02 03 03 03 ... > }
    */

    if(part.filename != undefined && part.filename != null){
      // console.log("current image in data: ");
      /*
      console.log(part.data);
      <Buffer ef bf bd ef bf bd ef     <- I am using this buffer to get the image
      */

then I use part.data and try to save it somewhere:

let bufferToRead = part.data;

          var file = new File({   // btw. using file.js from Formidable's lib.
            path: "/the/path/to/uploads/dir/" + part.filename,
            name: part.filename,
            type: part.type
          });
          file.open();
          file.write(bufferToRead.toString("binary")); 
          /*
          also tried:
           file.write(bufferToRead); 
           file.write(bufferToRead.toString("base64")); 
           file.write(bufferToRead.toString("utf-8")); 
           */

Interestingly, when I use

file.write(bufferToRead.toString("base64"));

the produced string ends in e+/vTnvv709ee+/ve+/vQDvv73vv73vv73vv73vv73vv73vv70UUUXvv73vv73vv70=,

while if we transform the image to base64 before sending it, the string ends in SPoOJAOOfb8qa7M38Oeo9eaKKAIiTk/Og9s0UUUWA/9k=
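
That mismatch is a hint that the bytes were altered before they ever reached the write. A minimal sketch (illustrative only, not from this thread) of how a UTF-8 round trip mangles JPEG bytes into the ef bf bd sequences seen in the logs above:

const original = Buffer.from([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10, 0x4a, 0x46, 0x49, 0x46]); // JPEG/JFIF header bytes
const roundTripped = Buffer.from(original.toString('utf-8'), 'utf-8');

// The bytes ff d8 ff e0 are not valid UTF-8, so each one is decoded to the
// replacement character U+FFFD and re-encoded as ef bf bd.
console.log(original);     // <Buffer ff d8 ff e0 00 10 4a 46 49 46>
console.log(roundTripped); // <Buffer ef bf bd ef bf bd ef bf bd ef bf bd 00 10 4a 46 49 46>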

btw, I originally wanted to process the image before writing it somewhere or uploading it to S3, so I was trying:

var gmPromise = new Promise(function (resolve, reject){

      // resolve(buffer);
      // var base64Buffer = buffer.toString('base64');
      // console.log(base64Buffer);

      gm(buffer, 'image.jpg').quality(40)
        .toBuffer('JPEG', function(err, modBuffer) {
          if (err) {
              reject(err);
          } else {
              resolve(modBuffer);
          }
      });

    });

But GraphicsMagick wasn't understanding the buffer I was using, so that's why I wanted to first write the unmodified file somewhere before troubleshooting ImageMagick. :(
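
In hindsight, a quick way to confirm that the buffer was already damaged before it ever reached gm is to check the JPEG signature. A hypothetical helper (not part of parse-multipart or gm):

// A valid JPEG buffer starts with the SOI marker ff d8 ff; the corrupted
// buffer logged above starts with ef bf bd instead.
function looksLikeJpeg(buf) {
  return Buffer.isBuffer(buf) &&
         buf.length >= 3 &&
         buf[0] === 0xff && buf[1] === 0xd8 && buf[2] === 0xff;
}

console.log(looksLikeJpeg(Buffer.from([0xff, 0xd8, 0xff, 0xe0]))); // true
console.log(looksLikeJpeg(Buffer.from([0xef, 0xbf, 0xbd, 0xef]))); // false (mangled bytes)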

rcfrias commented 6 years ago

OK, I found the error, and it is on my side. While trying to simulate the AWS Lambda environment on my local machine, I was making some mistakes in my Node code:

Since I am using json-server for this, I was handling my POST request this way:

server.post('/products', (req, res) => {
  // Test parse-multipart
  const { headers } = req;
  const contentType = headers['content-type'];

  let body = [];
  req.on('error', (err) => {
    console.error(err);
  }).on('data', (chunk) => {
    body.push(chunk);
  }).on('end', () => {

At this point, if we log body, we would get something like this:

//  console.log("body:");
    //  console.log(body);

    /*
    body:
[ <Buffer 2d 2d 2d 2d 2d 2d 57 65 62 4b 69 74 46 6f 72 6d 42 6f 75 6e 64 61 72 79 42 66 4e 7a 4d 68 6d 72 4f 67 31 56 35 36 64 58 0d 0a 43 6f 6e 74 65 6e 74 2d ... >,
  <Buffer c8 0b 30 56 c3 23 b9 60 9c 0c 9c 1e 79 6c f7 23 8a e8 b4 ef 19 f8 77 c5 52 34 3a 26 a4 1e 48 63 52 e9 22 b4 6c 41 38 fb ad 82 47 b8 fd 2b 2f 5a 45 91 ... >,
  <Buffer 7d 29 61 2c c7 27 04 b9 ea 47 39 cd 47 27 71 39 74 3d 03 49 d5 2e af 26 96 f2 5b 9e 40 62 8b e8 71 fe 15 f3 bf c6 cd 65 af 75 56 87 73 32 12 df 79 71 ... >,
  <Buffer 6e 25 1a ac 25 a6 8d db ee 90 07 3f d2 bd e3 5a f1 56 89 e1 cb 58 ae b5 8b f8 e0 0e 00 00 9e 49 3e 83 bd 66 e3 66 6b 19 dd 5c dc 0c 7d 7e 9e 94 ee e7 ... >,
  <Buffer 59 3c a5 e4 62 af cb 3c 32 c6 50 b7 6c 8e f9 aa 11 1b 82 e6 27 5d b1 1e 14 91 d6 b5 e5 b9 27 97 78 b3 c2 17 31 c8 67 b0 8f 71 19 24 28 e9 5c 2c 96 5f ... >,
  <Buffer 62 df cb 69 24 70 ea 25 55 1c e0 4e 3e e9 f4 c8 ea a7 a7 5e 39 eb 52 c1 b3 51 b8 4b b7 3b ad e2 e2 dd 47 21 8f 77 fe 60 7b 64 f7 a2 8a 56 d0 7d 4b c7 ... > ]
*/

Then I was using this line of code to get the body Buffer into one variable:

body = Buffer.concat(body)

So now, if we log the body, we would see this:

// console.log("body:");
    // console.log(body);

    /*
    body:
    <Buffer 2d 2d 2d 2d 2d 2d 57 65 62 4b 69 74 46 6f 72 6d 42 6f 75 6e 64 61 72 79 41 56 62 37 6c 6f 56 50 53 38 6e 33 6e 4e 57 41 0d 0a 43 6f 6e 74 65 6e 74 2d ... >
    */

NOW THIS IS THE BUG. My error was that I was converting the Buffer to a string like this:

body = body.toString(); // <- don't do this

just to be able to inspect the payload in a readable manner:

    // console.log("body:");
    // console.log(body);

    /*
    body:
    ------WebKitFormBoundaryPxJ9lHXIKOZSM761
    Content-Disposition: form-data; name="name"

    test product
    ------WebKitFormBoundaryPxJ9lHXIKOZSM761
    Content-Disposition: form-data; name="code"

    product code
    ------WebKitFormBoundaryPxJ9lHXIKOZSM761
    Content-Disposition: form-data; name="category"

    product category
    ------WebKitFormBoundaryPxJ9lHXIKOZSM761
    Content-Disposition: form-data; name="uploadFile"; filename="productImage.jpg"
    Content-Type: image/jpeg

    ����JFIF��C
    ��C
            �X�"��
    (rest of the image) ...
    ...

when in fact parse-multipart is able to parse the Buffer directly, without it being a string:

    var boundary = multipart.getBoundary(contentType);
    //var bodyBuffer = new Buffer(body);  // <- this was the second error: converting back to a Buffer after the first toString conversion.
    var parts = multipart.Parse(body,boundary);

And now I can write the correct image this way:

for(var i=0;i<parts.length;i++){
    var part = parts[i];
    if(part.filename != undefined && part.filename != null){
      let bufferToRead = part.data;
      var file = new File({  // using file.js from Formidable's lib.
        path: "/the/path/to/uploads/dir/" + part.filename,
        name: part.filename,
        type: part.type
      });
      file.open();
      file.write(part.data);
    }
}

Before this bugfix, I was able to write a file, but it ended up as a corrupted image on disk. Now the image is saved correctly. Conclusion: don't mess with the Buffer ^^
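
Putting it all together, a condensed sketch of the corrected flow (assuming the same json-server/Express-style server and the parse-multipart API used above; fs is used instead of Formidable's File just to keep the sketch self-contained):

const fs = require('fs');
const path = require('path');
const multipart = require('parse-multipart');

server.post('/products', (req, res) => {
  const contentType = req.headers['content-type'];
  const chunks = [];

  req.on('error', console.error)
     .on('data', (chunk) => chunks.push(chunk))
     .on('end', () => {
       // Keep the body as a Buffer end to end: no toString(), no re-encoding.
       const body = Buffer.concat(chunks);
       const boundary = multipart.getBoundary(contentType);
       const parts = multipart.Parse(body, boundary);

       parts
         .filter((part) => part.filename)
         .forEach((part) => {
           // part.data is still the original binary payload, so the image is written intact.
           fs.writeFileSync(path.join('/the/path/to/uploads/dir', part.filename), part.data);
         });

       res.sendStatus(200);
     });
});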

christiansalazar commented 6 years ago

One more thing: AWS. It preprocesses your binary payloads and takes care of the conversion issues, so you should pass your binary payloads in base64 every time.

christiansalazar commented 6 years ago

About your last message, "Conclusion, don't mess with the Buffer ^^", I'm sure it is due to this:

One more thing: AWS. It preprocesses your binary payloads and takes care of the conversion issues, so you should pass your binary payloads in base64 every time.
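
A minimal sketch of what that looks like inside a Lambda proxy handler, assuming the usual API Gateway proxy event shape (event.isBase64Encoded / event.body) with binary media types enabled; this is not code from the thread:

const multipart = require('parse-multipart');

exports.handler = async (event) => {
  const contentType = event.headers['content-type'] || event.headers['Content-Type'];

  // With binary support enabled, API Gateway base64-encodes the raw body and
  // sets isBase64Encoded, so decode it back into a Buffer before parsing.
  const body = event.isBase64Encoded
    ? Buffer.from(event.body, 'base64')
    : Buffer.from(event.body);

  const boundary = multipart.getBoundary(contentType);
  const parts = multipart.Parse(body, boundary);

  return { statusCode: 200, body: JSON.stringify(parts.map((p) => p.filename || p.name)) };
};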

rcfrias commented 6 years ago

you should pass your binary payloads in base64 every time.

Yes, I am aware of it, thank you for the reminder. I skipped that step to make it easier to debug, and in the end I added more complexity, which caused the bug! ^^ Anyway, I am happy with the end result! It feels good to have a productive week after all!

rcfrias commented 6 years ago

Unfortunately, when testing the "Lambda-Proxy integration", this turns out to be an issue again, since the payload is handled by AWS in the form of: (JSON.stringified for testing purposes...) Just to be clear, this is not the "Lambda integration" like the one in the video; for this option there is no setting for body mapping templates and binary support.

... (rest of the image)

In my attempts to transform the body back into a Buffer using other encodings, I got the following results:

ascii =  data: <Buffer fd fd fd fd 00 10 4a 46 49 46 00 01 01
utf8 = data: <Buffer ef bf bd ef bf bd ef bf bd ef bf bd 00
utf16le = []
base64 = []
binary = data: <Buffer fd fd fd fd 00 10 4a 46 49 46 00 01 01 00 00 01 00 01 00 00 fd fd 
hex = []

After that I haven't been able to recover the image file inside the lambda... :(