oven-sh / bun

Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one
https://bun.sh

Requests with Transfer-Encoding: chunked seems to be broken in Bun but not in Node #11621

Open shadone opened 5 months ago

shadone commented 5 months ago

### What version of Bun is running?

1.1.12

### What platform is your computer?

Darwin 23.5.0 arm64 arm

### What steps can reproduce the bug?

1. create `index.js`:

    const express = require('express')
    const bodyParser = require('body-parser')
    const multer = require('multer')

    const app = express()
    app.use(bodyParser.json())

    const upload = multer({ storage: multer.memoryStorage() })

    function handleRegistrations(req, res, next) {
      const product_id = req.body.product_id
      console.log(`### handleRegistrations: product_id=${product_id}`)
      res.status(200).json({ product_id })
    }

    app.post(
      "/registrations",
      upload.single("image"),
      handleRegistrations,
    )

    app.post("*", (req, res, next) => {
      console.log("Unhandled endpoint", req)
      res.status(404).send()
    })

    console.log("Listening on port 8080")
    app.listen(8080)

2. run the server with `bun ./index.js`

3. make the request with `Transfer-Encoding: chunked` like this: `curl -XPOST http://localhost:8080/registrations -F product_id=1234foobar567 -H 'Transfer-Encoding: chunked'`

The server prints the following error:

    Error: Unexpected end of form
        at _final (/Users/denis/tmp/node-multipart-form-data/node_modules/busboy/lib/types/multipart.js:588:13)
        at callFinal (node:stream:2778:23)
        at prefinish (node:stream:2801:23)
        at finishMaybe (node:stream:2808:25)
        at (node:stream:2741:76)
        at onend (node:stream:2088:21)
        at emit (native)
        at endReadableNT (node:stream:2396:27)
        at processTicksAndRejections (:12:39)
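
For context, `Transfer-Encoding: chunked` changes how the request body is framed on the wire: each chunk is preceded by its size in hexadecimal and followed by CRLF, and a zero-size chunk ends the body. The sketch below illustrates that framing only. `encodeChunked` is a hypothetical helper name, not part of Bun, Express, or curl, and the way the payload is split into two chunks is arbitrary.

```javascript
// Hypothetical helper: frames strings or Uint8Arrays using
// HTTP/1.1 chunked transfer coding and returns the bytes to send.
function encodeChunked(chunks) {
  const encoder = new TextEncoder();
  const parts = [];
  for (const chunk of chunks) {
    const bytes = typeof chunk === "string" ? encoder.encode(chunk) : chunk;
    // A zero-size chunk would terminate the body early, so skip empties.
    if (bytes.byteLength === 0) continue;
    parts.push(encoder.encode(bytes.byteLength.toString(16) + "\r\n"));
    parts.push(bytes);
    parts.push(encoder.encode("\r\n"));
  }
  // Terminating chunk with no trailer fields.
  parts.push(encoder.encode("0\r\n\r\n"));

  // Concatenate all parts into one Uint8Array.
  const total = parts.reduce((sum, part) => sum + part.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.byteLength;
  }
  return out;
}

const framed = new TextDecoder().decode(
  encodeChunked(["product_id=", "1234foobar567"]),
);
console.log(JSON.stringify(framed));
// prints "b\r\nproduct_id=\r\nd\r\n1234foobar567\r\n0\r\n\r\n"
```

A server has to strip this framing before handing the body to a multipart parser. If the body is cut short during that step, a parser could fail with an error like the one above, though the issue does not confirm that this is the cause.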



### What is the expected behavior?

Running the same server with Node instead of Bun produces no errors.

### What do you see instead?

_No response_

### Additional information

_No response_

shadone commented 5 months ago

I believe this could be the same issue as, or related to, the following: https://github.com/oven-sh/bun/issues/4139 https://github.com/oven-sh/bun/issues/9100 https://github.com/oven-sh/bun/issues/4234 https://github.com/oven-sh/bun/issues/8638

guest271314 commented 5 months ago

What happens when Bun.listen() is used to handle Transfer-Encoding: chunked?

shadone commented 5 months ago

> What happens when Bun.listen() is used to handle Transfer-Encoding: chunked?

I am not sure what you would like me to test with Bun.listen(). It seems to give raw TCP socket data, which I wouldn't know how to read from Bun/TS code.

However, I tried Bun.serve() instead; maybe that will be helpful:

async function streamToString(stream) {
  const chunks = [];

  for await (const chunk of stream) {
    chunks.push(Buffer.from(chunk));
  }

  return Buffer.concat(chunks).toString("utf-8");
}

Bun.serve({
  port: 8080,
  async fetch(req) {
    console.log("req:")
    console.log(req);

    const result = await streamToString(req.body)
    console.log("body:")
    console.log(result);

    return new Response("Bun!");
  },
});

I sent this request with curl: `curl -XPOST http://localhost:8080/registrations -F product_id=1234foobar567 -H 'Transfer-Encoding: chunked'`

Server logs:

$ bun ./index.js
req:
Request (0 KB) {
  method: "POST",
  url: "http://localhost:8080/registrations",
  headers: Headers {
    "host": "localhost:8080",
    "user-agent": "curl/8.6.0",
    "accept": "*/*",
    "transfer-encoding": "chunked",
    "content-type": "multipart/form-data; boundary=------------------------VA4meCsV7czVKCTz7jQ3NZ",
  }
}
body:
--------------------------VA4meCsV7czVKCTz7jQ3NZ
Content-Disposition: form-data; name="product_id"

1234foobar567
--------------------------VA4meCsV7czVKCTz7jQ3NZ--

guest271314 commented 5 months ago

What is broken?

The body appears to be raw "multipart/form-data; boundary ..." data. Any kind of data can be sent using Transfer-Encoding: chunked, including multipart/form-data. To parse the raw multipart/form-data into a FormData object, you can do something like this: https://gist.github.com/guest271314/78372b8f3fabb1ecf95d492a028d10dd#file-createreadwritedirectoriesinbrowser-js-L465-L477

// ...
const boundary = body.slice(2, body.indexOf("\r\n"));
return new Response(body, {
  headers: {
    "Content-Type": `multipart/form-data; boundary=${boundary}`,
  },
})
  .formData()
  .then((data) => {
    console.log([...data]);
    return data;
  })
  .catch((e) => {
    throw e;
  });
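
The excerpt above comes from inside a larger function. Below is a minimal self-contained sketch of the same idea, assuming a runtime with the standard `Response.formData()` API (Bun, or Node.js 18 and later). `parseMultipart`, `sampleBody`, and the boundary string are made up for illustration; the sample body is shaped like the one logged by the Bun.serve() test earlier in the thread.

```javascript
// Hypothetical helper: parses a raw multipart/form-data string into a
// FormData object using the standard Response.formData() API.
async function parseMultipart(body) {
  // The first line of a multipart body is "--" followed by the boundary.
  const boundary = body.slice(2, body.indexOf("\r\n"));
  const response = new Response(body, {
    headers: { "Content-Type": `multipart/form-data; boundary=${boundary}` },
  });
  return response.formData();
}

// Illustrative body with CRLF line endings; the boundary is invented.
const sampleBody = [
  "--exampleBoundary123",
  'Content-Disposition: form-data; name="product_id"',
  "",
  "1234foobar567",
  "--exampleBoundary123--",
  "",
].join("\r\n");

parseMultipart(sampleBody).then((form) => {
  console.log(form.get("product_id"));
});
```

This only works once the chunked framing has already been removed, which Bun.serve() had done in the log shown earlier.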

Here's the start of using a Bun.listen() TCP server to handle HTTP(S) requests and send Transfer-Encoding: chunked responses. Each chunk is basically its length in hexadecimal, followed by the content.

#!/usr/bin/env -S bun run
import { listen } from "bun";

const encoder = new TextEncoder();
const decoder = new TextDecoder();

function encodeMessage(message) {
  return encoder.encode(JSON.stringify(message));
}

const config = {
  async data(socket, data) {
    console.log("data");
    const request = decoder.decode(data);
    console.log({ request });
    if (/^OPTIONS/.test(request)) {
      socket.write("HTTP/1.1 204 No Content\r\n");
      socket.write(
        "Access-Control-Allow-Methods: OPTIONS,POST,GET,HEAD,QUERY\r\n",
      );
      socket.write("Access-Control-Allow-Origin: *\r\n");
      socket.write("Access-Control-Allow-Private-Network: true\r\n");
      socket.write(
        "Access-Control-Allow-Headers: Access-Control-Request-Private-Network\r\n\r\n"
      );
      socket.flush();
    }

    if (/^GET/i.test(request)) {
      console.log(request);
    }

    if (/^(POST|query)/i.test(request)) {
      const body = request.split("\r\n\r\n").pop();
      console.log(body);
      socket.write("HTTP/1.1 200 OK\r\n");
      socket.write("Content-Type: application/octet-stream\r\n");
      socket.write("Access-Control-Allow-Origin: *\r\n");
      socket.write("Access-Control-Allow-Private-Network: true\r\n");
      socket.write(
        "Access-Control-Allow-Headers: Access-Control-Request-Private-Network\r\n",
      );
      socket.write("Cache-Control: no-cache\r\n");
      socket.write("Transfer-Encoding: chunked\r\n\r\n");

      const url = new URL(body, import.meta.url);
      // const response = encoder.encode(body.toUpperCase().repeat(250));
      // socket.write("Content-Length: " + response.length + "\r\n\r\n");
      // socket.write(response);
      const response = await fetch(url.href);
      await response.body.pipeTo(
        new WritableStream({
          write(chunk) {
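            // Use the view's own length and bytes: chunk.buffer may be a
            // larger ArrayBuffer than the Uint8Array view that was read.
            const size = chunk.byteLength.toString(16);
            console.log(chunk.byteLength, size);
            socket.write(`${size}\r\n`);
            socket.write(chunk);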
            socket.write("\r\n");
          },
          close() {
            console.log("Stream closed");
          },
        }),
      );
      /*
        for (let i = 0; i < response.length; i++) {
          const chunk = response.subarray(i, i + 1);
          socket.write(`${chunk.length}\r\n`);
          socket.write(chunk);
          socket.write("\r\n");
        }
        */
      socket.write("0\r\n");
      socket.write("\r\n");
      socket.flush();
    }
  },
  async open(socket) {
    console.log("open");
  },
  close(socket) {
    console.log("close");
    server.stop(true);
    server.unref();
    process.exit();
  },
  drain(socket) {
    console.log("drain");
    socket.reload({ socket: config });
  },
  error(socket, error) {
    console.log({ error });
  },

  tls: {
    // can be string, BunFile, TypedArray, Buffer, or array thereof
    key: Bun.file("./certificate.key"),
    cert: Bun.file("./certificate.pem"),
  },
};

const server = listen({
  alpnProtocol: "http/1.1",
  hostname: "0.0.0.0",
  port: 8443,
  socket: config,
});

const { hostname, port } = server;

console.log(
  `Listening on hostname: ${hostname}, port: ${port}`,
);

// server.stop(true);
// let Bun process exit even if server is still listening
// server.unref();
/*
fetch("http://127.0.0.1:8443", {
  method: "post",
  //duplex: "half",
  headers: {
    "Access-Control-Allow-Request-Network": true
  },
  body: "blade_runner.webm"
}).then((r) => {
  console.log(r);
  return r.body.pipeTo(
    new WritableStream({
      write(chunk) {
        console.log(chunk);
      },
      close() {
        console.log("Stream closed");
      }
    })
  );
}).catch(console.error);
*/
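
A raw TCP handler like the one above receives request bodies exactly as they were sent, so a request made with Transfer-Encoding: chunked would still carry its chunk framing. Below is a minimal sketch of removing that framing. It assumes the whole body has already been received as a single Uint8Array and ignores trailer fields. `decodeChunked` is a hypothetical helper, not a Bun API.

```javascript
// Hypothetical helper: decodes a complete chunked message body
// (the bytes after the header section) into the raw payload bytes.
function decodeChunked(bytes) {
  const decoder = new TextDecoder();
  const parts = [];
  let pos = 0;
  while (true) {
    // Locate the CRLF that ends the chunk-size line.
    let lineEnd = pos;
    while (
      lineEnd + 1 < bytes.length &&
      !(bytes[lineEnd] === 13 && bytes[lineEnd + 1] === 10)
    ) {
      lineEnd++;
    }
    if (lineEnd + 1 >= bytes.length) throw new Error("Incomplete chunk-size line");

    // The size is hexadecimal; anything after ";" is a chunk extension.
    const sizeLine = decoder.decode(bytes.subarray(pos, lineEnd));
    const size = parseInt(sizeLine.split(";")[0].trim(), 16);
    if (Number.isNaN(size)) throw new Error("Invalid chunk size");
    pos = lineEnd + 2;

    if (size === 0) break; // last chunk; trailer fields are ignored here
    if (pos + size + 2 > bytes.length) throw new Error("Incomplete chunk data");
    parts.push(bytes.subarray(pos, pos + size));
    pos += size + 2; // skip the chunk data and its trailing CRLF
  }

  // Concatenate the chunk payloads.
  const total = parts.reduce((sum, part) => sum + part.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.byteLength;
  }
  return out;
}

const wire = new TextEncoder().encode(
  "b\r\nproduct_id=\r\nd\r\n1234foobar567\r\n0\r\n\r\n",
);
console.log(new TextDecoder().decode(decodeChunked(wire)));
// prints product_id=1234foobar567
```

A real server would also need to handle bodies that arrive across several `data` events, which this sketch does not attempt.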

I continued the work to include processing WebSocket connections in the same server code here: https://github.com/guest271314/direct-sockets-http-ws-server/blob/main/assets/script.js. It is a WebSocket and HTTP server run in Chromium or Chrome browsers using the Direct Sockets API.

Here is the standalone WebSocket code, which is JavaScript-runtime agnostic: https://gist.github.com/guest271314/735377527389f1de6145f0ac71ca1e86.

It uses some logic from https://github.com/guest271314/webserver-c/tree/quickjs-webserver, which is based on the explanations and descriptions in "Making a simple HTTP webserver in C" (GitHub: https://github.com/guest271314/webserver-c).

guest271314 commented 5 months ago

See Parse multi-part formdata in the browser.