Open vmspike opened 1 month ago
@vmspike, thanks for report. Could you at least share the part of your config calling your lua script ? But a common error in TCP is to call it from a tcp-request content
rule without inspect delay configured.
By reading the code quickly, it seems to do what is documented. I have only one doubt about the offset. If it is smaller than the available data in the channel buffer, an error out of range
is returned. It seems a bit strange and rude. But I have no longer the context in mind.
However, at several places, you call txn:done()
. When this happens the session is closed. It don't know if it what you want to do.
However, I can confirm there is a way to have an empty string. It happens when offset 0 is set and a length is provided. So txn.req:data(0, 10)
for instance.
Thank you for taking a look to the issue! As for documentation I understood that data() should return either data, or wait for data till some timeout hits (or lua processing itself timed out), or return nill if no data was received during the wait and unable to receive more (e.g. if connection is terminated)
inspect delay is set.
txn:done()
is desired because such request does not pass certain checks.
This is the basic truncated config:
global
daemon
cpu-map 1/1- 0-
lua-load socks.lua
tune.lua.maxmem 2048
tune.lua.session-timeout 4s
<...>
defaults fe
mode tcp
tcp-request inspect-delay 4s
timeout client 4s
timeout client-fin 3s
option tcp-smart-accept
<...>
# Few hundreds of such frontends and related backends
backend some_tls_wrap_be from be-tls
server some_srv abns@some_socket send-proxy-v2
backend some_http_be from be-http
server <...>
backend some_socks_be from be-tcp
<...>
tcp-response content lua.socks_be
server <...>
frontend some_frontend from fe
bind "ipv4@${SOME_BIND_IP}:${SOME_BIND_PORT}" id 1
bind abns@some_socket accept-proxy ssl crt "${TLS_CERT}" alpn h2,http/1.1 id 2
acl TLS req.ssl_hello_type -m int 1
acl SO_PUBLIC so_id -m int 1
acl <...>
tcp-request connection <...>
tcp-request session <...>
tcp-request content <...>
tcp-request content accept if SO_PUBLIC TLS
tcp-request content <...>
tcp-request content switch-mode http if HTTP
tcp-request content lua.socks_fe
<...>
http-request <...>
<...>
use_backend some_tls_wrap_be if SO_PUBLIC TLS
use_backend some_http_be if HTTP
default_backend some_socks_be
It's hybrid tcp/http frontend for http/socks proxies traffic with optional tls-wrap.
As a side note it's a log-format issue for hybrid backends, but it's offtopic .
Channel.data used in different combinations, e.g.:
txn.req:data()
- ?
txn.req:data(0, 3)
- ?
txn.req:data(3)
- may return empty string
txn.req:data(3, 10)
- the most common case to get empty string as a result
but offset seems always related to already present data in buffer (e.g. the end of already handled data).
Also for no-lenth data (usually with offset set) when empty string returned (so #c
is 0), txn.sc:be2hex(c,'')
strangely returns 8-bytes zeroes:
Intermediate data (of size 0): 0000000000000000
Not sure it's related.
I suspect such empty string cases related mostly to quite distant src location requests (inter continents).
inspect delay is set.
txn:done()
is desired because such request does not pass certain checks.
Copy that. That was to be sure because it is a common mistake. However, it may be an oversight, but there is no inspect-delay on the response.
Channel.data used in different combinations, e.g.:
txn.req:data()
- ?txn.req:data(0, 3)
- ?txn.req:data(3)
- may return empty stringtxn.req:data(3, 10)
- the most common case to get empty string as a result but offset seems always related to already present data in buffer (e.g. the end of already handled data).
For the offset, at first glance, it seems a bit strange. The code is written this way. But I don't know if it is expected or not. I must be careful because these functions may also be called from a lua filter. So, I must check first before breaking other things. But there is indeed a bug about the empty string and I will fix it.
Also for no-lenth data (usually with offset set) when empty string returned (so
#c
is 0),txn.sc:be2hex(c,'')
strangely returns 8-bytes zeroes:
I must check, it seems to be a bug.
it may be an oversight, but there is no inspect-delay on the response
It's hidden in truncated part! Though it's set, but does not related to this report (I experience it purely on request/frontend part).
So, I have some answers.
About your strange output when you call txn.sc:be2hex()
, it happens when you call it with nil
and it seems expected. Only integers, booleans and strings are converted from lua to the corresponding internal sample type. All other lua types are interpreted as false
internally . Because be2hex
expects a binary on input, nil
, interpreted as false
, is converted to the binary value of 0 on 64 bits,
Then there is probably an issue with the offset usage when Channel.data()
is called. As said, I must try to remember why it was done this way. But it seems strange to return an error when some data may still be received and there is not enough data to respect the offset. It means the data length must be checked before, and it seems a bit painful. I suspect it is related to the lua filters.
About the documentation, depending how you read it, the function does what is documented. nil
is returned when there is no incoming data at all and no more data can be received anymore. Otherwise length
bytes of incoming data are returned, starting at the offset offseŧ
, or all available incoming data if the length
argument cannot be respected. So, this may be an empty string if there are exactly offset
bytes of incoming data. So txn.sc:data(3,10)
will return an empty string if only 3 bytes are received. The function does not return nil
because there are incoming data.
About your strange output when you call
txn.sc:be2hex()
, it happens when you call it withnil
and it seems expected.
Think it's different. I've seen such behavior in a place (see the code example) when c
is first checked for being nil, then checked for #c <3, then got that output (specifically printed tostring(#c)
on the same line which was printed as 0
), so c
was:
Suspect it's empty string then, if c
is nil #c
would lead to exception at least.
But this happened only for the case when length
is not provided, e.g. c = txn.req:data(3)
, not for c = txn.req:data(3,10)
.
About the documentation, depending how you read it, the function does what is documented.
nil
is returned when there is no incoming data at all and no more data can be received anymore. Otherwiselength
bytes of incoming data are returned, starting at the offsetoffseŧ
, or all available incoming data if thelength
argument cannot be respected. So, this may be an empty string if there are exactlyoffset
bytes of incoming data. Sotxn.sc:data(3,10)
will return an empty string if only 3 bytes are received. The function does not returnnil
because there are incoming data.
Clear, thanks!
So the function may return nil
only for 0 offset, when the channel has no data at all. When it returns empty string (or less than length
data) it also means that the channel cannot receive more data anymore (as with nil), otherwise it waits for more data right?
Better to clarify this in the docs then.
Think would be more comfortable to return string always then (even empty) instead of nil (if not keeping nil for historic reasons) - less checks for nil, uniformed return datatype.
About txn.sc:be2hex()
, I'm pretty sure. I've performed following tests:
core.log(core.info, 'TEST 1 : '..txn.sc:be2hex(true,'')) -- result: "TEST 1 : 0000000000000001
core.log(core.info, 'TEST 2 : '..txn.sc:be2hex(false,'')) -- result: "TEST 2 : 0000000000000000
core.log(core.info, 'TEST 3 : '..txn.sc:be2hex(nil,'')) -- result: "TEST 3 : 0000000000000000
core.log(core.info, 'TEST 4 : '..txn.sc:be2hex('','')) -- result "TEST 4 : "
core.log(core.info, 'TEST 5 : '..txn.sc:be2hex('1234','')) -- result "TEST 5 : 1234"
About the documentation, indeed, it may be improved. nil
value is returned when there is no data and it is a special value to know nothing was received, at the opposite of the empty string that just mean no data was extracted but something was received. And indeed, it happens when no more data can be received. But It is not the same than no data at all. Because at the end, the data returned may be smaller than expected, I guess you should already have a test on the string length if it is a prerequisite in your script. So an empty string is a smaller string. Not really a special case.
The API is probably not perfect but it cannot easily be modified. Some scripts may rely on the actual behavior. The backward compatibility is really important in that case. From time to time, there are breakage. But it must be truly necessary.
Note, that since the beginning we said nil
is returned when no more data can be received. But it may also be because the lua action can no longer wait for more data. For instance because the inspect delay was reached.
I will try to take a look to the offset parameter. I'm just busy on stable releases for now.
Detailed Description of the Problem
Based on https://discourse.haproxy.org/t/clarify-channel-data-behavior/10190/2 Related to #1993
Noticed that this function still returns empty string from time to time. Usually (always?) when the channel finally got closed (if explicitly checked by
txn.req:may_recv()
when empty string returned).Hope that this function is not the cause of connection closure...
Expected Behavior
Always return either nil or some data (wait for some data if none yet or less than
length
), not empty string. Or modify documentation accordingly to behavior.Steps to Reproduce the Behavior
Use a lot of requests with some tcp frontend action like this:
Originally tested on socks5 protocol handler...
Do you have any idea what may have caused this?
Suspect that might be related to a race after
channel_may_recv(chn)
.Do you have an idea how to solve the issue?
Update the docs. For user's scripts test for both nil and empty string equally.
What is your configuration?
Output of
haproxy -vv
Last Outputs and Backtraces
No response
Additional Information
No response