fukamachi / woo

A fast non-blocking HTTP server on top of libev
http://ultra.wikia.com/wiki/Woo_(kaiju)
MIT License
1.28k stars 97 forks source link

Parsing multipart/form-data with file upload fails #47

Closed thomp closed 8 years ago

thomp commented 8 years ago

Data in a POST with multipart/form-data seems to not be consistently handled in the manner one would expect.

(defvar *app* (make-instance 'ningle:<app>))

(setf (ningle:route *app* "/x" :method :post)
      #'(lambda (params)
          (declare (ignore params))
      "Hey!"))

(clack:clackup
 *app*
 :server :woo
 :port 8083
 :use-default-middlewares nil)

Trying to post with a file upload triggers "Callback Error: the message-complete callback failed".

echo 'testing123' > /tmp/x.txt
curl -F 'a="1234"' -F 'file0=@"/tmp/x.txt"' --trace-ascii /tmp/curl.out http://127.0.0.1:8083/x

I tried digging around a bit and it seems that perhaps the data provided to FAST-HTTP.PARSER's PARSE-REQUEST does not reflect the original content of the POST request. Compare the trace provided by curl and the data that is provided to PARSE-REQUEST:

The trace for curl:

== Info:   Trying 127.0.0.1...
== Info: Connected to 127.0.0.1 (127.0.0.1) port 8083 (#0)
=> Send header, 209 bytes (0xd1)
0000: POST /x HTTP/1.1
0012: Host: 127.0.0.1:8083
0028: User-Agent: curl/7.47.0
0041: Accept: */*
004e: Content-Length: 291
0063: Expect: 100-continue
0079: Content-Type: multipart/form-data; boundary=--------------------
00b9: ----929b14fd4ca5ce6b
00cf: 
== Info: Done waiting for 100-continue
=> Send data, 232 bytes (0xe8)
0000: --------------------------929b14fd4ca5ce6b
002c: Content-Disposition: form-data; name="a"
0056: 
0058: "1234"
0060: --------------------------929b14fd4ca5ce6b
008c: Content-Disposition: form-data; name="file0"; filename="x.txt"
00cc: Content-Type: text/plain
00e6: 
=> Send data, 11 bytes (0xb)
0000: testing123.
=> Send data, 48 bytes (0x30)
0000: 
0002: --------------------------929b14fd4ca5ce6b--
<= Recv header, 26 bytes (0x1a)
0000: HTTP/1.1 400 Bad Request
<= Recv header, 37 bytes (0x25)
0000: Date: Thu, 28 Apr 2016 15:38:59 GMT
<= Recv header, 19 bytes (0x13)
0000: Connection: close
<= Recv header, 20 bytes (0x14)
0000: Content-Length: 15
<= Recv header, 2 bytes (0x2)
0000: 
<= Recv data, 15 bytes (0xf)
0000: 400 Bad Request
== Info: Closing connection 0

The data that PARSE-REQUEST sees:

testing123

--------------------------929b14fd4ca5ce6b--
tion: form-data; name="a"

"1234"
--------------------------929b14fd4ca5ce6b
Content-Disposition: form-data; name="file0"; filename="x.txt"
Content-Type: text/plain

Apologies in advance if I'm missing something obvious...

knobo commented 8 years ago

I get something different. Which platform are you on? I suspect post multipart does not work on linux. Only Mac.

This is what HTTP-BODY.MULTIPART:MULTIPART-PARSE gets in the FLEXI-STREAMS::VECTOR-INPUT-STREAM

--------------------------2e4ec1523e4643ad
Content-Disposition: form-data; name=\"file0\"; filename=\"x.txt\"
Content-Type: text/plain

---------
--------------------------2e4ec1523e4643ad--
"

Output from curl trace

== Info: Trying 127.0.0.1... == Info: Connected to 127.0.0.1 (127.0.0.1) port 5000 (#0) => Send header, 209 bytes (0xd1) 0000: POST /x HTTP/1.1 0012: Host: 127.0.0.1:5000 0028: User-Agent: curl/7.43.0 0041: Accept: / 004e: Content-Length: 291 0063: Expect: 100-continue 0079: Content-Type: multipart/form-data; boundary=-------------------- 00b9: ----2e4ec1523e4643ad 00cf: == Info: Done waiting for 100-continue => Send data, 232 bytes (0xe8) 0000: --------------------------2e4ec1523e4643ad 002c: Content-Disposition: form-data; name="a" 0056: 0058: "1234" 0060: --------------------------2e4ec1523e4643ad 008c: Content-Disposition: form-data; name="file0"; filename="x.txt" 00cc: Content-Type: text/plain 00e6: => Send data, 11 bytes (0xb) 0000: testing123. => Send data, 48 bytes (0x30) 0000: 0002: --------------------------2e4ec1523e4643ad--

thomp commented 8 years ago

Indeed, this was on Linux (using SBCL).

knobo commented 8 years ago

Well, now I also tested on freebsd, and it should use the same backend as OS X, and I had some problems there too. So I was probably wrong about the Linux vs. not-Linux idea.

knobo commented 8 years ago

Very strange content is coming out of the socket inside libev..

I added some debug code to the reader callback

(define-c-callback tcp-read-cb :void ((evloop :pointer) (watcher :pointer) (events :int))
  (declare (ignore evloop events))
  (let* ((fd (io-fd watcher))
         (buffer-len (length *input-buffer*))
         (socket (deref-data-from-pointer fd))
         (read-cb (socket-read-cb socket)))
    (loop
      (let ((n (wsys:read fd (static-vectors:static-vector-pointer *input-buffer*) buffer-len)))
        (declare (dynamic-extent n))
        (case n
          (-1
           (let ((errno (wsys:errno)))
             (cond
               ((or (= errno wsys:EWOULDBLOCK)
                    (= errno wsys:EINTR)))
               ((or (= errno wsys:ECONNABORTED)
                    (= errno wsys:ECONNREFUSED)
                    (= errno wsys:ECONNRESET))
                (vom:error "Connection is already closed (Code: ~D)" errno)
                (close-socket socket))
               (t
                (vom:error "Unexpected error (Code: ~D)" errno)
                (close-socket socket))))
           (return))
          (0
           ;; EOF
           (setf (socket-last-activity socket) (lev:ev-now *evloop*))
           (close-socket socket)
           (return))
          (otherwise
           (setf (socket-last-activity socket) (lev:ev-now *evloop*))
       ;; Please check if there is more to read here, before continuing!
       (warn "-> N: ~a" n)

           (when read-cb
         (warn "Callback: ~s" (babel:octets-to-string *input-buffer* :end (position 0 *input-buffer*) ))
             (funcall (the function read-cb) socket *input-buffer* :start 0 :end n))
           (unless (= n buffer-len)
             (return))))))))

And this is what showed up in my repl:

WARNING: -> N: 209 WARNING: Callback: "POST /x HTTP/1.1 Host: 127.0.0.1:5000 User-Agent: curl/7.43.0 Accept: / Content-Length: 291 Expect: 100-continue Content-Type: multipart/form-data; boundary=------------------------7f0f25c620e7dbec

nt-Type: text/plain

testing123

--------------------------2139f55e0c544739-- " WARNING: -> N: 232 WARNING: Callback: "--------------------------7f0f25c620e7dbec Content-Disposition: form-data; name=\"a\"

\"1234\" --------------------------7f0f25c620e7dbec Content-Disposition: form-data; name=\"file0\"; filename=\"x.txt\" Content-Type: text/plain

testing123

--------------------------2139f55e0c544739-- " WARNING: -> N: 59 WARNING: Callback: "testing123

--------------------------7f0f25c620e7dbec-- tion: form-data; name=\"a\"

\"1234\" --------------------------7f0f25c620e7dbec Content-Disposition: form-data; name=\"file0\"; filename=\"x.txt\" Content-Type: text/plain

testing123

--------------------------2139f55e0c544739-- "

[12:05:20] woo - HTTP parse error: INVALID-HEADER-TOKEN: invalid character in header
knobo commented 8 years ago

@thomp Could you try to add some code to https://github.com/fukamachi/woo/blob/master/src/ev/tcp.lisp#L89 And see if it now works for you? It should now work for small POSTs, I think.

  (loop
       (progn
     (map-into *input-buffer* (constantly 0)) ;; lazy-coding clear buffer.
     (let ((n (wsys:read fd (static-vectors:static-vector-pointer *input-buffer*) buffer-len)))
      (declare (dynamic-extent n))
      (case n
thomp commented 8 years ago

I think you hit the nail on the head. It looks like clearing the buffer does the trick. The curl trace with your modification to TCP-READ-CB:

== Info:   Trying 127.0.0.1...
== Info: Connected to 127.0.0.1 (127.0.0.1) port 8083 (#0)
=> Send header, 209 bytes (0xd1)
0000: POST /x HTTP/1.1
0012: Host: 127.0.0.1:8083
0028: User-Agent: curl/7.47.0
0041: Accept: */*
004e: Content-Length: 291
0063: Expect: 100-continue
0079: Content-Type: multipart/form-data; boundary=--------------------
00b9: ----ea0903ad31898c85
00cf: 
== Info: Done waiting for 100-continue
=> Send data, 232 bytes (0xe8)
0000: --------------------------ea0903ad31898c85
002c: Content-Disposition: form-data; name="a"
0056: 
0058: "1234"
0060: --------------------------ea0903ad31898c85
008c: Content-Disposition: form-data; name="file0"; filename="x.txt"
00cc: Content-Type: text/plain
00e6: 
=> Send data, 11 bytes (0xb)
0000: testing123.
=> Send data, 48 bytes (0x30)
0000: 
0002: --------------------------ea0903ad31898c85--
<= Recv header, 17 bytes (0x11)
0000: HTTP/1.1 200 OK
<= Recv header, 37 bytes (0x25)
0000: Date: Thu, 26 May 2016 16:15:55 GMT
<= Recv header, 24 bytes (0x18)
0000: Connection: keep-alive
<= Recv header, 28 bytes (0x1c)
0000: Transfer-Encoding: chunked
<= Recv header, 2 bytes (0x2)
0000: 
<= Recv data, 14 bytes (0xe)
0000: 4
0003: Hey!
0009: 0
000c: 
== Info: Connection #0 to host 127.0.0.1 left intact
knobo commented 8 years ago

It makes the problem go a way, but it is not the right solution. I tried this instead: (wsys:memset (static-vectors:static-vector-pointer *input-buffer*) 0 buffer-len) And it did not solve the problem.

The I added (warn "After memset:" (babel:octets-to-string *input-buffer* :end (position 0 *input-buffer*) )) after the memset, and then it worked again.

So I guess it has to do with timing or something like that.

EDIT: Sorry, it probably still solves the issue here, but I'm working on a different problem also, which it does not solve.

Btw, the way to clear array with lisp is (fill *input-buffer* 0).

EDIT2: No, the curl example fails with "400 Bad Request" with use of the memset, but works again if I add (sleep 0.1).. But actually the memset is not necessary. The (sleep 0.1) is enough,

knobo commented 8 years ago

Looks like this is fixed by d589524 But @thomp should verify this and close the ticket.