Open dblessing opened 8 years ago
This version of net-ldap seems to ignore time
arg in the search
method. so this is a bug.
PR #243 which provides support for timeout has been merged. Now you should use the current master or the next version, 0.13.0 but no plan to release now.
@satoryu thanks for triaging. I've opened https://github.com/ruby-ldap/ruby-net-ldap/pull/248 to prepare the next release.
sorry, I misunderstood as I saw only Net::LDAP#search
.
the value of time
given on Net::LDAP#search
is used by Net::LDAP::Connection#search
and it seems that the arg should follow the specification defined in RFC 4511. But #243 would resolve @dblessing 's situation, I think.
@dblessing do you think you can confirm the latest version of net-ldap can resolve your situation? if not resolved, we need to keep this issue open, I think.
I just upgraded to 0.13.0 and tested this again. Neither connect_timeout
nor time
work for me. I need both a connect and a read timeout.
Here's a code example of how I'm setting connect timeout:
options = { :host=>"127.0.0.1", :port=>33890, :encryption=>nil, :connect_timeout=>5 }
Net::LDAP.open(options) do |ldap|
block.call(self.new(provider, ldap))
end
I'm using ToxiProxy to simulate a timeout while connecting to an LDAP server. I see that the attempt to open the connect hangs indefinitely and does not time out after the 5 seconds specified. This is the same behavior I found with time
on Net::LDAP#search
Please let me know how I can help troubleshoot this further.
@astratto could you take a look at @dblessing's transcript and see why timeout isn't being respected?
@jch I'll try to find some time to take a look at that in the next few days, sure.
Has there been any further information on this? We had an "event" last Friday due to a LDAP host being offline, and the v0.14.0 LDAP client hanging. It might have been waiting for a default TCP timeout to occur as the code doesn't set a timeout value.
Hey guys, sorry but I had to focus on other more urgent things and then it slipped my mind.
I've been using this code on different machines and also testing the timeout very often, so my guess is that you're hitting some corner cases we're not aware of.
@dblessing can you share with us the actual tests you're running?
I had a little bit of spare time today and here's what I've found so far. Note that these tests are using directly Ruby's Socket.tcp and don't involve net-ldap on purpose.
The test server is just an ugly and silly script:
require 'socket'
socketServer = TCPServer.open(8080)
while true
Thread.new(socketServer.accept) do |conn|
puts "Accepting connection from: #{conn.peeraddr[2]}"
conn.puts "Close"
conn.close
end
end
Disclaimer: First time I use it, tell me if I'm doing something stupid.
Using the timeout feature of Toxiproxy, I can still see the socket creating the connection and then it hangs. This is because the timeout is on socket creation, but after the socket is there, there's no read timeout applied (Socket.tcp doesn't support it).
Toxiproxy test - client side
require 'toxiproxy'
Toxiproxy.reset
Toxiproxy.populate([
{
name: "myservice",
listen: "127.0.0.1:22220",
upstream: "127.0.0.1:8080"
}
])
puts "#{Time.now} Testing no response and close connection after 5s with timeout at connect of 1s"
Toxiproxy[:myservice].downstream(:timeout, timeout: 5000).apply do
puts Socket.tcp('127.0.0.1', 22220, connect_timeout: 1).read
end
puts "#{Time.now} Done."
Result client-side:
$ ruby toxiproxy.rb
2016-05-13 17:39:49 +0100 Testing no response and close connection after 5s with timeout at connect of 1s
2016-05-13 17:39:54 +0100 Done.
Result server-side:
Accepting connection from: 127.0.0.1
You can see that the socket is actually created and it reaches both Toxiproxy and the server. As expected the timestamp with Done is after 5 seconds instead of 1.
Toxiproxy logs:
INFO[0563] Accepted client client=127.0.0.1:60321 name=myservice proxy=127.0.0.1:22220 upstream=127.0.0.1:8080
WARN[0568] Source terminated bytes=0 err=read tcp 127.0.0.1:22220->127.0.0.1:60321: use of closed network connection name=myservice
If, instead, I set a firewall rule to drop the packets and then actually simulate the inability of creating a socket, the code times out just fine.
I use OSX, so that's basically a:
$ # enable firewall
$ sudo pfctl -ef /etc/pf.conf
$ # drop packets on port 8080
$ (sudo pfctl -sr 2>/dev/null; echo "block drop quick proto tcp from any to any port = 8080") | sudo pfctl -f - 2>/dev/null
(Disable and flush your rules with)
sudo pfctl -df /etc/pf.conf
Result client-side
$ cat simple_socket.rb
require 'socket'
begin
puts "#{Time.now} Starting..."
Socket.tcp('127.0.0.1', 8080, connect_timeout: 1).read
rescue => e
puts "#{Time.now} Error: #{e}"
end
$ ruby simple_socket.rb
2016-05-13 17:59:56 +0100 Starting...
2016-05-13 17:59:57 +0100 Error: Operation timed out - user specified timeout
@astratto Thank you for testing. Sorry for not getting back sooner.
I wonder if it would be beneficial to look at something like https://github.com/lann/tcp-timeout-ruby. It would be a runtime dependency, but it's also small so the logic could potentially be pulled into this gem, too. It would give us a connect, read and write timeout.
If I get a chance, I'll do some testing to see if this is a drop-in replacement and if it solves the problem. If it is a drop-in replacement, maybe an alternative would be to allow users to pass the socket class all the way down? It's already an arg on the connection class. Just a thought...
I think I'm seeing this problem even during the LDAP open; connections to ldap.va.pki.gov (to fetch CRLs) are sometimes hanging here:
net-ldap-0.16.0/lib/net/ber/ber_parser.rb:169:in ‘getbyte’ net-ldap-0.16.0/lib/net/ber/ber_parser.rb:169:in ‘read_ber’ net-ldap-0.16.0/lib/net/ldap/connection.rb:234:in ‘block in read’ net-ldap-0.16.0/lib/net/ldap/instrumentation.rb:19:in ‘instrument’ net-ldap-0.16.0/lib/net/ldap/connection.rb:233:in ‘read’ net-ldap-0.16.0/lib/net/ldap/connection.rb:201:in ‘queued_read’ net-ldap-0.16.0/lib/net/ldap/auth_adapter/simple.rb:23:in ‘bind’ net-ldap-0.16.0/lib/net/ldap/connection.rb:278:in ‘block in bind’ net-ldap-0.16.0/lib/net/ldap/instrumentation.rb:19:in ‘instrument’ net-ldap-0.16.0/lib/net/ldap/connection.rb:275:in ‘bind’ net-ldap-0.16.0/lib/net/ldap.rb:715:in ‘block in open’ net-ldap-0.16.0/lib/net/ldap/instrumentation.rb:19:in ‘instrument’ net-ldap-0.16.0/lib/net/ldap.rb:711:in ‘open’ net-ldap-0.16.0/lib/net/ldap.rb:644:in ‘open’
This particular server seems to have no timeout on its side either and I've seen connections hung at this point for days.
Hi, on certain servers I observe search and even the bind operation to hang for ever.
I've been trying to reproduce this but I don't think I'm using toxiproxy correctly because I'm not seeing the timeout.
# socket_server.rb
require 'socket'
socketServer = TCPServer.open(8080)
while true
Thread.new(socketServer.accept) do |conn|
puts "Accepting connection from: #{conn.peeraddr[2]}"
conn.puts "Close"
conn.close
end
end
➤ ruby socker_server.rb
Accepting connection from: 127.0.0.1
Accepting connection from: 127.0.0.1
➤ docker run -it -p 8474:8474 -p 22220:22220 --network host shopify/toxiproxy
WARNING: Published ports are discarded when using host network mode
INFO[0000] API HTTP server starting host=0.0.0.0 port=8474 version=2.1.4
INFO[0004] Started proxy name=myservice proxy=127.0.0.1:22220 upstream=127.0.0.1:8080
INFO[0004] Accepted client client=127.0.0.1:43286 name=myservice proxy=127.0.0.1:22220 upstream=127.0.0.1:8080
WARN[0004] Source terminated bytes=0 err=read tcp 127.0.0.1:22220->127.0.0.1:43286: use of closed network connection name=myservice
INFO[0004] Accepted client client=127.0.0.1:43290 name=myservice proxy=127.0.0.1:22220 upstream=127.0.0.1:8080
WARN[0004] Source terminated bytes=0 err=read tcp 127.0.0.1:22220->127.0.0.1:43290: use of closed network connection name=myservice
# toxiproxy.rb
require 'logger'
require 'net/ldap'
require 'toxiproxy'
logger = Logger.new(STDOUT)
Toxiproxy.reset
Toxiproxy.populate([
{
name: "myservice",
listen: "127.0.0.1:22220",
upstream: "127.0.0.1:8080"
}
])
options = { :host=>"127.0.0.1", :port=>22220, :encryption=>nil, :connect_timeout=>5 }
Toxiproxy[:myservice].downstream(:timeout, timeout: 5000).apply do
logger.info "Starting TCP connection..."
logger.debug Socket.tcp('127.0.0.1', 22220, connect_timeout: 22).read
logger.info "Done."
logger.info "Starting LDAP connection..."
ldap = Net::LDAP.new(options)
conn = ldap.send :new_connection
logger.debug conn.socket.read
conn.close
logger.info "Done."
end
➤ ruby toxiproxy.rb
I, [2020-10-29T01:03:39.816195 #10079] INFO -- : Starting TCP connection...
D, [2020-10-29T01:03:39.816804 #10079] DEBUG -- :
I, [2020-10-29T01:03:39.816823 #10079] INFO -- : Done.
I, [2020-10-29T01:03:39.816834 #10079] INFO -- : Starting LDAP connection...
D, [2020-10-29T01:03:39.817193 #10079] DEBUG -- :
I, [2020-10-29T01:03:39.817243 #10079] INFO -- : Done.
Late to the party, but I have encountered this occurring with LDAPS (connecting via openssl) more often than unencrypted LDAP lately. Wondering if reproducing would be more reliable with TLS/SSL on the mock server.
We're using version 0.12.1, which should support the
time:
arg in thesearch
method. However, in testing with a network simulator proxy I can't get this argument to actually timeout and abort the search.Example:
I would expect the above query to time out after 10 seconds but it actually goes indefinitely. Is this a bug or am I misunderstand usage?