lightningnetwork / lnd

Lightning Network Daemon ⚡️
MIT License

lnrpc.ConnectPeerRequest : Timeout does not seem to work at times #6889

Open BhaagBoseDK opened 2 years ago

BhaagBoseDK commented 2 years ago

Background

See #4123 and #4452, which seem to fix this issue. However, I noticed that the call does not always respect the timeout and can take 3+ minutes.

Call

response = stub.ConnectPeer(request = ln.ConnectPeerRequest(addr=address, perm=False, timeout=5))

Result

Mon Sep  5 11:07:46 2022 : ... Attempting connection to peer.alias='MidasMulligan' inactive_peer='03fd21fdee8e9adc070b53a0bca8685cbc0b97f27b72f364723dcc54b4164ab21b' host='322e5mbonli3h4r4la6ox2gt2cm534k7uk3cbl6yr34rtxyfnj6mdiid.onion:9735'
Mon Sep  5 11:11:39 2022 : .... Error reconnecting MidasMulligan inactive_peer='03fd21fdee8e9adc070b53a0bca8685cbc0b97f27b72f364723dcc54b4164ab21b' str(e)='<_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNKNOWN\n\tdetails = "dial proxy failed: socks connect tcp 10.21.21.11:9050->322e5mbonli3h4r4la6ox2gt2cm534k7uk3cbl6yr34rtxyfnj6mdiid.onion:9735: unknown error general SOCKS server failure"\n\tdebug_error_string = "{"created":"@1662376299.194364800","description":"Error received from peer ipv4:10.21.21.9:10009","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"dial proxy failed: socks connect tcp 10.21.21.11:9050->322e5mbonli3h4r4la6ox2gt2cm534k7uk3cbl6yr34rtxyfnj6mdiid.onion:9735: unknown error general SOCKS server failure","grpc_status":2}"\n>'

Steps to reproduce

This happens only at times. I could not find a specific pattern.


Expected behaviour

The timeout should be respected.

Actual behaviour

At times the call gets stuck.

BhaagBoseDK commented 2 years ago

Another example

umbrel@umbrel:~/utils/mylndg/lndg $ time lncli connect 037c0bb263a4fe95f0ed598b50cf51e5732a6bc1c48d1c81c5e89f2442939dffdd@109.173.205.40:55596 --timout 30s

real    2m2.672s
user    0m2.016s
sys     0m0.318s

positiveblue commented 2 years ago

I think we may need more documentation about this.

The timeout does not apply to the request as a whole (from the moment the client calls ConnectPeer until it gets a response from the server), but only to establishing the dial connection. If there is name resolution ... So everything else is not counted in that timeout: the noise protocol, the handshake, waiting to acquire locks, etc.
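
If the goal is to bound how long the client itself waits, one option is a client-side gRPC deadline on the call, which is separate from the `timeout` field inside `ConnectPeerRequest`. The sketch below assumes `stub` is the generated Lightning stub and `request` is the `ConnectPeerRequest` from the report; the helper name `connect_peer_with_deadline` is hypothetical. In gRPC Python, exceeding the deadline should surface as an RPC error with status `DEADLINE_EXCEEDED`. This only limits the client's wait; nothing in this thread establishes whether lnd abandons its own connection attempt at that point.

```python
import time


def connect_peer_with_deadline(stub, request, overall_timeout_s):
    """Call ConnectPeer with a client-side deadline on the whole RPC.

    Assumptions: `stub` is a generated lnrpc Lightning stub and `request`
    is an lnrpc ConnectPeerRequest; neither is constructed here.
    Returns a (response, elapsed_seconds) tuple.
    """
    start = time.monotonic()
    # `timeout=` on the stub call is the gRPC client deadline in seconds.
    # It is independent of the dial timeout carried inside the request.
    response = stub.ConnectPeer(request, timeout=overall_timeout_s)
    return response, time.monotonic() - start
```

With the call from the report, this might be used as `connect_peer_with_deadline(stub, ln.ConnectPeerRequest(addr=address, perm=False, timeout=5), 15)`, so the dial is given 5 seconds and the whole RPC 15 seconds. Logging the elapsed time for each attempt can also help narrow down which peers trigger the long waits.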

I will check to ensure that timeouts apply when using tor dials.

In your second example, I think you are using the default timeout (2 min) because of a misspelled parameter name: --timout 30s instead of --timeout 30