kernelsauce / turbo

Turbo is a framework built for LuaJIT 2 to simplify the task of building fast and scalable network applications. It uses an event-driven, non-blocking, no-thread design to deliver excellent performance and a minimal footprint to high-load applications, while also providing excellent support for embedded uses.
http://turbo.readthedocs.io/
Apache License 2.0

Implement streaming parsing of multipart/form-data, as turbo uses 3x the size of the uploaded file(s) in memory while parsing a multipart/form-data request #367

Open gary8520 opened 11 months ago

gary8520 commented 11 months ago

This PR isolates the function that parses multipart headers from parse_multipart_data(), and adds kwargs.streaming_multipart_bytes to httpserver so that users can parse multipart data in a streaming fashion. Huge files (those exceeding kwargs.large_body_bytes, or 512 if it is not set) are saved to /tmp.

The 3x memory usage arises as follows. For instance, with a file of about 100M, the multipart/form-data body is also about 100M:

  1. In iostream.IOStream:_read_to_buffer(), self._read_buffer:append_right(ptr, sz) expands the buffer to 100M.
  2. In iostream.IOStream:_consume(loc), chunk = ffi.string(ptr + self._read_buffer_offset, loc) copies the C string into a Lua string, which takes another 100M.
  3. In httputil.parse_multipart_data(data, boundary), argument[1] = data:sub(v1, b2) slices the huge file content out as a new string, which takes another 100M.
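
The arithmetic behind the 3x figure can be written out as a small tally. This is only a sketch: it reads "100M" as 100 MiB and assumes all three copies are alive at the same time, as the description implies. The function name tally_copies is made up for illustration and is not part of turbo.

```lua
-- Hypothetical helper (not part of turbo): add up the copies of the
-- request body listed in the three steps above. Sizes are in MiB.
local function tally_copies(body_mib)
    local stages = {
        { where = "IOStream:_read_to_buffer", what = "read buffer grown by append_right" },
        { where = "IOStream:_consume",        what = "Lua string created by ffi.string" },
        { where = "parse_multipart_data",     what = "new string created by data:sub" },
    }
    local total = 0
    for _, stage in ipairs(stages) do
        -- Assumption: each stage holds one full copy of the body.
        total = total + body_mib
        print(string.format("%-26s %-36s +%d MiB (running total %d MiB)",
            stage.where, stage.what, body_mib, total))
    end
    return total
end

local total_mib = tally_copies(100)
print("memory held for a 100 MiB upload: " .. total_mib .. " MiB")
```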

The solution in this PR is as follows:

  1. Streaming parsing handles each chunk as it arrives, so the buffer never has to grow to the full body size.
  2. Chunks are parsed directly from the raw buffer struct, which avoids converting them to Lua strings.
  3. Huge files are saved under /tmp chunk by chunk; in the handler, the user can os.rename() the file from /tmp to wherever they want.
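
The chunk-by-chunk idea in the list above can be illustrated with a minimal sketch in plain Lua. It is not the PR's code: new_streaming_sink is a hypothetical helper, the boundary and the 7-byte chunk size are invented for the demo, and it works on Lua strings for readability, whereas the PR is described as parsing the raw buffer struct to avoid exactly that conversion. The point shown is only how file content can be written out as chunks arrive while still detecting a delimiter that is split across two chunks.

```lua
-- Hypothetical helper (not part of turbo): returns a function that is fed
-- successive chunks and writes everything before `delimiter` to `out`.
local function new_streaming_sink(delimiter, out)
    local keep = #delimiter - 1 -- bytes that could be the start of a delimiter
    local tail = ""             -- held-back bytes from earlier chunks
    local done = false
    return function(chunk)
        if done then return true end
        local data = tail .. chunk
        local s = data:find(delimiter, 1, true) -- plain find, no patterns
        if s then
            out:write(data:sub(1, s - 1))
            tail, done = "", true
            return true
        end
        -- No complete delimiter yet: flush all but the last `keep` bytes,
        -- since a delimiter may straddle this chunk and the next one.
        local flush_len = #data - keep
        if flush_len > 0 then
            out:write(data:sub(1, flush_len))
            tail = data:sub(flush_len + 1)
        else
            tail = data
        end
        return false
    end
end

-- Demo with an invented boundary and deliberately tiny chunks.
local delimiter = "\r\n--boundary42"
local content = "file content line 1\r\nline 2"
local body = content .. delimiter .. "--\r\n"

local path = os.tmpname()
local out = assert(io.open(path, "wb"))
local feed = new_streaming_sink(delimiter, out)
for i = 1, #body, 7 do
    if feed(body:sub(i, i + 6)) then break end
end
out:close()

local f = assert(io.open(path, "rb"))
local saved = f:read("*a")
f:close()
os.remove(path)
print(saved == content)
```

Memory held by the sink stays bounded by one chunk plus the delimiter length, regardless of the file size. In a handler, the saved file could then be moved with os.rename(), which returns nil plus an error message on failure, so its result is worth checking.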

Testing result:

$ luajit examples/multipart.lua

The response is OK, and the uploaded and received files are identical.
