Support parsing of multipart/byteranges

jcready commented 7 years ago

Currently formidable returns an empty body and a single item in the files object which is named 'null' and only contains the last part of the multipart byte-range body.

tunnckoCore commented 7 years ago

@felixge any help on that? Maybe https://github.com/jshttp/range-parser worth look?

felixge commented 7 years ago

I don't understand how this is supposed to work. @jcready can you provide an example request and what you'd expect formidable to do with it?

jcready commented 7 years ago

@felixge a multipart/byteranges response would look something like this (taken from RFC 7233):

HTTP/1.1 206 Partial Content
Date: Wed, 15 Nov 1995 06:25:24 GMT
Last-Modified: Wed, 15 Nov 1995 04:58:08 GMT
Content-Length: 1741
Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES

--THIS_STRING_SEPARATES
Content-Type: application/pdf
Content-Range: bytes 500-999/8000

...the first range...
--THIS_STRING_SEPARATES
Content-Type: application/pdf
Content-Range: bytes 7000-7999/8000

...the second range
--THIS_STRING_SEPARATES--

I imagine the parsed body would end up looking like this:

[
  {
    "headers": {
      "content-type": "application/pdf",
      "content-range": "bytes 500-999/8000"
    },
    "body": "...the first range..."
  },
  {
    "headers": {
      "content-type": "application/pdf",
      "content-range": "bytes 7000-7999/8000"
    },
    "body": "...the second range"
  }
]

Ideally this functionality would just be an additional parser like lib/byterange_parser.js where it would be used if the response's Content-Type started with multipart/byteranges. I've managed to get the desired output by overwriting formidable.IncomingForm.prototype.parse:

formidable.IncomingForm.prototype.parse = function parseByteRanges (res, done) {
  const parts = []
  let totalLength = 0
  res.on('error', done).on('data', (chunk) => {
    parts.push(chunk)
    totalLength += chunk.length
  }).on('end', () => {
    try {
      const boundary = res.headers['content-type'].match(/boundary=(.+)$/)[1]
      const body = Buffer.concat(parts, totalLength)
      done(null, parseMultipartBody(body.toString(), boundary))
    } catch (e) {
      done(e)
    }
  })
}

function parseMultipartBody (body, boundary) {
  return body.split('--' + boundary).reduce((memo, part) => {
    if (part && part !== '--') {
      const [ head, body ] = part.trim().split(/\r\n\r\n/g)
      memo.push({
        headers: head.split(/\r\n/).reduce((memo, header) => {
          const [ key, val ] = header.split(/:\s+/)
          memo[key.toLowerCase()] = val
          return memo
        }, {}),
        body: body
      })
    }
    return memo
  }, [])
}

tunnckoCore commented 7 years ago

@jcready hm, looks good. Can you try to PR and add some tests when you have time? We (the new maintainers) are here to help and review :)

felixge commented 7 years ago

@jcready makes sense. But I wonder if this should really be part of formidable. If yes, it should use the streaming multipart parser that's used for multipart/form-data and avoid unbound memory allocation. If nobody has time for a good implementation and tests, I'd rather not see this supported.

node-formidable / formidable

Support parsing of multipart/byteranges #390