Closed knobo closed 8 years ago
Now I have tried this on FreeBSD on a virtual machine on my laptop, and I have the same problem there. And FreeBSD uses the kqueue backend for libev as OS X does.
I tried with lev:+EVBACKEND-SELECT+
still same problem. Not sure if I used it in the right place.
EDIT: And now I also tried lev:+EVBACKEND-POLL+. Still the same.
;; File: woo-20160318-git/src/ev/event-loop.lisp
(defmacro with-event-loop ((&key enable-fork cleanup-fn) &body body)
`(let ((*evloop* (lev:ev-loop-new (+ lev:+EVBACKEND-SELECT+
(if ,enable-fork
lev:+EVFLAG-FORKCHECK+
0))))
(*callbacks* (make-hash-table :test 'eql))
(*data-registry* (make-hash-table :test 'eql))
(*input-buffer* (make-static-vector *buffer-size*)))
(unwind-protect (progn
,@body
(lev:ev-run *evloop* 0))
(let ((close-socket-fn (intern #.(string :close-socket) (find-package #.(string :woo.ev.socket)))))
(maphash (lambda (fd socket)
(declare (ignore fd))
(funcall close-socket-fn socket))
*data-registry*))
,@(when cleanup-fn
`((funcall ,cleanup-fn)))
(free-static-vector *input-buffer*)
(cffi:foreign-free *evloop*))))
I believe the fix is something like this:
--- woo-20160318-git/src/ev/tcp.lisp 2016-05-25 11:27:28.727523260 +0200
+++ woo-20160318-git/src/ev/tcp.lisp.fix 2016-05-25 11:27:26.799516499 +0200
@@ -110,7 +110,6 @@
(return))
(otherwise
(setf (socket-last-activity socket) (lev:ev-now *evloop*))
- ;; Please check if there is more to read here, before continuing!
(when read-cb
(funcall (the function read-cb) socket *input-buffer* :start 0 :end n))
(unless (= n buffer-len)
This improves the situation, But does not fix it. I can now upload buffer-size big file. So If I increase the buffer size, I can upload more. It also helps on https://github.com/fukamachi/woo/issues/47
(loop
(progn
(map-into *input-buffer* (constantly 0)) ;; lazy-coding clear buffer.
(let ((n (wsys:read fd (static-vectors:static-vector-pointer *input-buffer*) buffer-len)))
(declare (dynamic-extent n))
(case n
Hmm... It seems the request body is broken when the request is chunked and has a large body. I could reproduce it even with Mac.
Good to know it wasn't just me and my code that was broken ;) Is this something you intend to fix?
Probably needs more time to fix it. I still don't get what the exact reason to reproduce it. It doesn't occur when the request body is a big ASCII string.
I wonder, what is the mechanism for handling a request that is bigger then woo.ev.event-loop:*buffer-size*
. When the parser is called with the first part of the requests, the parser stops halfway somewhere inside a request, and then the rest of the request is read with wsys:read
, and the parser is calld again, then how does the parser know that it is suppose to continue parsing a header value or a body value?
Is it a state that is saved in the parser somewhere?
Sorry for keep you waiting. I suppose d589524de4c1bcb50bef520df89d96f1ca983a34 has fixed this issue and it's already in the master branch. Please check out the latest Woo.
It woks better now yes. But not every time. I open up my upload-file test project, and play with some large values, and sometimes (by random) I get this error. This time I uploaded 1 MB, and just clicked repeatedly on the input box, and suddenly the error appears. It is a different issue I guess.
Callback Error: the body callback failed The bounding indices 0 and 869656 are bad for a sequence of length 65536. [Condition of type FAST-HTTP.ERROR:CB-BODY]
Restarts: 0: [RETRY] Retry SLIME REPL evaluation request. 1: [*ABORT] Return to SLIME's top level. 2: [ABORT] abort thread (#<THREAD "repl-thread" RUNNING {1004E30033}>)
Backtrace:
0: ((FLET #:H0 :IN FAST-HTTP.PARSER::READ-BODY-DATA) #
Here is a bit more detailed debug information. Is there anything more I can do to provide more information? Are you able to reproduce this error?
Looks like this INDICES-BAD-ERROR occurs right at the new code.
0: (SB-INT:SEQUENCE-BOUNDING-INDICES-BAD-ERROR #(120 120 120 120 120 120 ...) 0 9901384)
1: (SB-KERNEL:VECTOR-SUBSEQ* #<unavailable argument> #<unavailable argument> #<unavailable argument>)
2: ((LAMBDA (WOO::DATA WOO::START WOO::END) :IN WOO::SETUP-PARSER) #(120 120 120 120 120 120 ...) 0 9901384)
Locals:
SB-DEBUG::ARG-0 = #(120 120 120 120 120 120 ...)
SB-DEBUG::ARG-1 = 0
SB-DEBUG::ARG-2 = 9901384
3: (FAST-HTTP.PARSER::READ-BODY-DATA #S(FAST-HTTP.HTTP:HTTP-REQUEST :METHOD :POST :MAJOR-VERSION 1 :MINOR-VERSION 1 :STATUS 0 :CONTENT-LENGTH 0 :CHUNKED-P NIL ...) #S(FAST-HTTP.PARSER:CALLBACKS :MESSAGE-..
Locals:
BODY-END = 9901384
#:CALLBACK0 = #<CLOSURE (LAMBDA (FAST-HTTP.HTTP:HTTP FAST-HTTP::DATA FAST-HTTP::START FAST-HTTP::END) :IN FAST-HTTP:MAKE-PARSER) {10082F540B}>
CALLBACKS = #S(FAST-HTTP.PARSER:CALLBACKS ..)
DATA = #(120 120 120 120 120 120 ...)
END = 4294967295
HTTP = #S(FAST-HTTP.HTTP:HTTP-REQUEST :METHOD :POST :MAJOR-VERSION 1 :MINOR-VERSION 1 :STATUS 0 :CONTENT-LENGTH 0 :CHUNKED-P NIL ...)
READABLE-COUNT = 4294967295
START = 0
Why aren't the requests the same size every time?
;;; Changed code too look inside the request
(defun read-body-data (http callbacks data start end)
(declare (type http http)
(type simple-byte-vector data)
(type pointer start end))
(let ((readable-count (the pointer (- end start))))
(declare (dynamic-extent readable-count)
(type pointer readable-count))
(if (<= (http-content-length http) readable-count)
(let ((body-end (+ start (http-content-length http))))
(declare (dynamic-extent body-end))
(warn "body length: ~a ~a ~a ~a" body-end start (http-content-length http) readable-count)
...
...
...
Request size: 32499. I get no error from this. I get the output (One line for every click):
WARNING: body length: 460 0 460 460
WARNING: body length: 460 0 460 460
WARNING: body length: 460 0 460 460
WARNING: body length: 463 0 463 463
WARNING: body length: 33231 518 32713 32713
WARNING: body length: 463 0 463 463
WARNING: body length: 460 0 460 460
But when request size is larger (1000000): This time the error came on the second upload.
Woo server is going to start.
Listening on localhost:5050.
WARNING: body length: 17694 0 17694 17694
WARNING: body length: 902430 0 902430 4294967295
EDIT: The end value is unreasonable large, don't you think?
Is there something wrong with signed/unsigned value of N in WOO.EV.TCP::TCP-READ-CB ?
0: (FAST-HTTP.PARSER::READ-BODY-DATA #S(FAST-HTTP.HTTP:HTTP-REQUEST :METHOD :POST :MAJOR-VERSION 1 :MINOR-VERSION 1 :STATUS 0 :CONTENT-LENGTH 771355 :CHUNKED-P NIL ...) #S(FAST-HTTP.PARSER:CALLBACKS :MES..
1: (FAST-HTTP.PARSER::PARSE-BODY #S(FAST-HTTP.HTTP:HTTP-REQUEST :METHOD :POST :MAJOR-VERSION 1 :MINOR-VERSION 1 :STATUS 0 :CONTENT-LENGTH 771355 :CHUNKED-P NIL ...) #S(FAST-HTTP.PARSER:CALLBACKS :MESSAGE..
2: (FAST-HTTP.PARSER:PARSE-REQUEST #S(FAST-HTTP.HTTP:HTTP-REQUEST :METHOD :POST :MAJOR-VERSION 1 :MINOR-VERSION 1 :STATUS 0 :CONTENT-LENGTH 771355 :CHUNKED-P NIL ...) #S(FAST-HTTP.PARSER:CALLBACKS :MESSA..
3: ((LAMBDA (FAST-HTTP::DATA &KEY :START :END) :IN FAST-HTTP:MAKE-PARSER) #(120 120 120 120 120 120 ...) :START #<unavailable argument> :END #<unavailable argument>)
4: (WOO::READ-CB #S(WOO.EV.SOCKET:SOCKET :WATCHERS #(#.(SB-SYS:INT-SAP #X7FFFE400BA00) #.(SB-SYS:INT-SAP #X7FFFE400BA40) #.(SB-SYS:INT-SAP #X7FFFE400C1F0)) :LAST-ACTIVITY 1.4661329746566706d9 :FD 61 :REM..
Locals:
#:.DEFAULTING-TEMP. = 0
#:.DEFAULTING-TEMP.#1 = 4294967295
DATA = #(120 120 120 120 120 120 ...)
#:N-SUPPLIED-274 = 1
SOCKET = #S(WOO.EV.SOCKET:SOCKET ..)
5: (WOO.EV.TCP::TCP-READ-CB #<unused argument> #.(SB-SYS:INT-SAP #X7FFFE400BA00) #<unused argument>)
Locals:
BUFFER-LEN = 65536
FD = 61
N = 4294967295
READ-CB = #<FUNCTION WOO::READ-CB>
SOCKET = #S(WOO.EV.SOCKET:SOCKET ..)
WATCHER = #.(SB-SYS:INT-SAP #X7FFFE400BA00)
When I change read to return a ssize-t
(defcfun ("read") ssize-t
(fd :int)
(buf :pointer)
(count :unsigned-int))
I sometimes get:
<ERROR> [05:20:41] woo.ev.tcp - Unexpected error (Code: 11)
http://www-numi.fnal.gov/offline_software/srt_public_context/WebDocs/Errors/unix_system_errors.html
Says 11 means "Try again". Is that right?
Looks like this fixed it: ` ;;; woo/src/syscall/main.lisp (defconstant EAGAIN 11.)
`
(define-c-callback tcp-read-cb :void ((evloop :pointer) (watcher :pointer) (events :int))
(declare (ignore evloop events))
(let* ((fd (io-fd watcher))
(buffer-len (length *input-buffer*))
(socket (deref-data-from-pointer fd))
(read-cb (socket-read-cb socket)))
(loop
(let ((n (wsys:read fd (static-vectors:static-vector-pointer *input-buffer*) buffer-len)))
(declare (dynamic-extent n))
(case n
(-1
(let ((errno (wsys:errno)))
(cond
((or (= errno wsys:EWOULDBLOCK)
(= errno wsys:EINTR)))
((or (= errno wsys:ECONNABORTED)
(= errno wsys:ECONNREFUSED)
(= errno wsys:ECONNRESET))
(vom:error "Connection is already closed (Code: ~D)" errno)
(close-socket socket))
((= errno wsys::eagain)
(warn "retrying"))
(t
(vom:error "Unexpected error (Code: ~D)" errno)
(close-socket socket))))
(return))
(0
;; EOF
diff --git a/src/ev/tcp.lisp b/src/ev/tcp.lisp
index 7a7c654..7d96a39 100644
--- a/src/ev/tcp.lisp
+++ b/src/ev/tcp.lisp
@@ -99,6 +99,9 @@
(= errno wsys:ECONNRESET))
(vom:error "Connection is already closed (Code: ~D)" errno)
(close-socket socket))
+ ((= errno wsys::eagain)
+ ;; Jus to nothing
+ )
(t
(vom:error "Unexpected error (Code: ~D)" errno)
(close-socket socket))))
diff --git a/src/syscall/main.lisp b/src/syscall/main.lisp
index 35f59a8..6387e68 100644
--- a/src/syscall/main.lisp
+++ b/src/syscall/main.lisp
@@ -17,7 +17,7 @@
(buf :pointer)
(count size-t))
-(defcfun ("read") :unsigned-int
+(defcfun ("read") ssize-t
(fd :int)
(buf :pointer)
(count :unsigned-int))
@@ -51,6 +51,8 @@
(defconstant ECONNREFUSED 61.)
(defconstant ECONNRESET 54.)
(defconstant ENOTCONN 57.)
+(defconstant EAGAIN 11.)
+
(defun set-fd-nonblock (fd enabled)
(declare (optimize (speed 3) (safety 0)))
This bug is only on Linux and not OS X.
Use program from this bugreport to reproduce. https://github.com/fukamachi/fast-http/issues/27