jprjr / lua-resty-exec

Run external programs in OpenResty without spawning a shell or blocking
MIT License

socket timeout error #16

Open jurafa opened 6 years ago

jurafa commented 6 years ago

Hi. This issue is already partially mentioned here.

Behavior: From time to time, certain requests end up with a timeout error.

To be specific, this code returns with 'timeout' in the err variable:

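    local exec = require 'resty.exec'
    local prog = exec.new('/tmp/exec.sock')  -- socket that sockexec listens on
    prog.argv = argv                         -- argv: command and arguments, built earlier
    local res, err = prog()                  -- err == 'timeout' on the failing requests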

Environment: I am using OpenResty with lua-resty-exec and sockexec to expose the output of some system calls as a web service. OpenResty is running on small devices such as the Raspberry Pi.

Cause: At first the error seemed to appear randomly, but it now seems to appear only when multiple shell commands are executed within a single request. More specifically, OpenResty receives one request, but handling it requires running several shell commands through lua-resty-exec. A new exec instance is created for each command, and therefore a new socket is created for each command as well. The function grab_ns is then called for each socket to receive data from it. It is called in a loop until all data has been read or an error occurs. The first call is made with timeout=60000. Subsequent calls are made with timeout=0, which the function turns into timeout=1 internally.

This works fine if only one socket is active. What I think is happening is that, when multiple sockets are active, the worker might not handle the self.sock:receive() call immediately and might switch to a different context with a different socket. By the time it returns to continue, the 1 ms timeout has already expired. (Please note that the tcpsock:receive documentation uses timeout=1000.)
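The suspected race can be mimicked without OpenResty. The sketch below is a plain-Lua simulation, not code from lua-resty-exec: new_mock_sock, sched_delay_ms and read_all are hypothetical names, and the mock only models the assumption that a read fails when the time the worker spends on other sockets exceeds the socket timeout.

```lua
-- Plain-Lua simulation; nothing here touches ngx or lua-resty-exec.
-- new_mock_sock, sched_delay_ms and read_all are hypothetical names.

-- A mock socket that delivers `chunks`, one per receive() call.
-- `sched_delay_ms` models how long the worker is busy with other
-- sockets before it gets back to this read.
local function new_mock_sock(chunks, sched_delay_ms)
  local sock = { timeout = 0, pos = 0 }
  function sock:settimeout(ms) self.timeout = ms end
  function sock:receive()
    if sched_delay_ms > self.timeout then
      return nil, 'timeout'
    end
    self.pos = self.pos + 1
    local chunk = chunks[self.pos]
    if not chunk then return nil, 'closed' end
    return chunk
  end
  return sock
end

-- Read loop in the shape described above: a long timeout for the first
-- read, then `next_timeout` ms for every following read.
local function read_all(sock, first_timeout, next_timeout)
  local out = {}
  local timeout = first_timeout
  while true do
    sock:settimeout(timeout)
    local chunk, err = sock:receive()
    if not chunk then
      if err == 'closed' then return table.concat(out) end
      return nil, err
    end
    out[#out + 1] = chunk
    timeout = next_timeout
  end
end

-- One active socket: the worker comes back immediately (0 ms delay).
local idle, idle_err = read_all(new_mock_sock({ 'a', 'b' }, 0), 60000, 1)
-- Several active sockets: the worker comes back after 5 ms.
local busy, busy_err = read_all(new_mock_sock({ 'a', 'b' }, 5), 60000, 1)

print(idle, idle_err)
print(busy, busy_err)
```

In this model the first read always succeeds because of the 60000 ms timeout, and only the follow-up reads with the 1 ms timeout are exposed to the delay, which would match the observation that single-command requests are unaffected.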

When the socket times out, a log entry similar to this one appears in the nginx logs:

2018/01/11 11:20:03 [error] 9582#0: *541745 lua tcp socket read timed out, client: *****, server: localhost, request: "POST /rpc HTTP/1.1", host: "*****", referrer: "http://localhost:4200/"

Suggested solution: If I change the timeout of the subsequent calls to self.sock:receive() to, for example, timeout=100, the issue seems to be solved (see patch.diff).
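One way to express this idea is a lower bound on the read timeout. The following is only a sketch of that approach and not the content of patch.diff: clamp_timeout and MIN_TIMEOUT_MS are hypothetical names, and 100 is simply the example value mentioned above, not a tuned number.

```lua
-- Hypothetical helper; lua-resty-exec does not ship this function.
local MIN_TIMEOUT_MS = 100  -- example value from this issue, not tuned

-- Return a timeout that is never shorter than MIN_TIMEOUT_MS.
local function clamp_timeout(timeout)
  if not timeout or timeout < MIN_TIMEOUT_MS then
    return MIN_TIMEOUT_MS
  end
  return timeout
end

print(clamp_timeout(0))      -- 100
print(clamp_timeout(60000))  -- 60000
```

The result would be passed to the socket's settimeout before each receive call, so the long first-read timeout is kept while the follow-up reads get enough slack to survive a context switch.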

wanrui commented 5 years ago

Haha