Open WheezyE opened 1 year ago
Thank you to KM4ACK & WH6AZ (of iOS RadioMail) for finding the root cause of this issue!
Just consolidating some notes & tests here on this github ticket so we can work in the open.
- [Linux TCP ports enter a] `TIME_WAIT` state after the last connection on them is terminated. Apparently this is to prevent DoS attacks and also to prevent packet loss in some edge cases (references: 1, 2, 3).
- [The port states can be watched with] `netstat | grep tcp`. VARA's localhost:8300/localhost:8301 ports enter `ESTABLISHED` state when an RMS Express VARA HF P2P session is first opened. Then, as soon as VARA HF is closed, the 8300/8301 ports enter `TIME_WAIT` state on Linux. I timed the ports staying in `TIME_WAIT` for about 60 seconds. Then the ports disappear from netstat (they close & can be re-used again).
- [While the ports are in] `TIME_WAIT`, VARA HF will currently not attempt to connect to them. We believe that this is the cause of our issue. Said another way: if VARA is run, then closed, then run again, VARA will not re-establish a connection to RMS Express (or any other controller program over TCP) within this 60-second window.
- [Setting] `net.ipv4.tcp_keepalive_time`, `net.ipv4.tcp_fin_timeout`, & `sunrpc.tcp_fin_timeout` to int `1` doesn't seem to change anything(?) with/without network reset (`TIME_WAIT` still stays for 60s). Doing this also wrecks the internet on the Pi.
- [Setting] `net.ipv4.tcp_tw_reuse` to int `1` (global enable) doesn't change any behavior either.
- […] `TIME_WAIT` problem, which is a port "busy" state that arises AFTER the port's connection is cut.
- […] `TIME_WAIT` state on a port and connect anyway _(similar to the C socket option `SO_REUSEADDR`) (1,2)_
- […] `TIME_WAIT` state. (KM4ACK's idea - he also has a prototype script written to do this).
- […] `sudo service networking restart`. (This is not a favorable option since it could cause users to lose internet connection / data unexpectedly).

I'm going to try the Possible Solution 2 (above): VARA-bridge-Linux for TCP connections, which EA5HVK also recently suggested when I contacted him.
I'll start trying to write a bridge app in VB6 to see if I can circumvent the `TIME_WAIT` condition. If that succeeds, I'll see if sending source code to EA5HVK might help implement it in VARA. If that's not an option, then I'll see if I can complete the bridge app.
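The thread doesn't describe how the bridge app would work internally, so the following is only one guess at its general shape: a small, long-lived TCP relay that owns the client-facing port (opened with `SO_REUSEADDR`) and forwards each client to a backend address. It is written in Python rather than VB6 purely for illustration. The names `pump`, `handle`, and `start_bridge`, and the choice of ports, are hypothetical and not taken from VARA or from any existing bridge.

```python
import socket
import threading


def pump(src, dst):
    """Copy bytes from src to dst until src reaches EOF or errors out."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        # Pass the end-of-stream on so the far side can finish too.
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client, backend_addr):
    """Relay one client connection to the backend in both directions."""
    try:
        backend = socket.create_connection(backend_addr)
    except OSError:
        client.close()  # backend not reachable (e.g. not started yet)
        return
    t = threading.Thread(target=pump, args=(backend, client), daemon=True)
    t.start()
    pump(client, backend)
    t.join()
    client.close()
    backend.close()


def start_bridge(listen_port, backend_addr):
    """Listen on 127.0.0.1:listen_port and relay each client to backend_addr."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Hypothetical design choice: allow rebinding even while old
    # connections on this port are still in TIME_WAIT.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", listen_port))
    srv.listen()

    def accept_loop():
        while True:
            try:
                client, _ = srv.accept()
            except OSError:
                return  # listener was closed
            threading.Thread(
                target=handle, args=(client, backend_addr), daemon=True
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv
```

A call such as `start_bridge(9300, ("127.0.0.1", 8300))` would relay a made-up port 9300 to the 8300 port mentioned above. Whether a relay like this actually sidesteps the `TIME_WAIT` problem for VARA under Wine is untested here.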
Random ideas and random words here, fine to ignore, as I haven't done enough data gathering to really offer any, let alone that pe1rrr level of data!

I can connect and disconnect a lot with no issues, it seems, using TCP-connected projects like CHAT (with VARA). I can't replicate this error per se, but I don't use Winlink much.

Is this only Winlink related? I saw possibly a KISS-connected phone app as well. (Ouch, I just paid for it so I can debug this more myself. It will be an extra handy platform to use, since it focuses only on VARA in Wine over TCP KISS, really keeping it simple for this thread.)

Is this a function of a Winlink-specific clog? Do layers 6-7 need to be looked at, with Winlink and VARA in tcpdump, to find any strange collisions? I was going to try to sniff out why my dev box is not impacted (I am on 5.10 still). All this rambling is to hopefully help and to ask: is this a Winlink-only issue, or any TCP application? VarAC issues? We need more eyes on the problem and more data to make this go away. I have not looked at the provided links for solutions in detail yet to see if I am fully foolish in saying any of this, but, just sayin', I did see network issues once and they did go away for me. I will get more data as time allows. Love to hear more; gonna dig into pe1rrr's links as soon as possible. :) 73, how's the general license? ;)
Your ideas are always welcome! 😃 And thank you for being so interested and wanting to do so much testing.
> I can connect and disconnect a lot with no issues, it seems, using TCP-connected projects like CHAT (with VARA). I can't replicate this error per se, but I don't use Winlink much.
Over-the-air/radio-signal VARA connections/disconnections should work fine. However, Linux TCP ports enter a temporary "TIME_WAIT" state after a program closes one of the ports. This usually causes an issue for VARA when a program closes VARA and then re-opens it (like RMS Express), or when an external program tries to re-connect to VARA's TCP/IP ports over local/wifi connections (like RadioMail for iPhone).
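For anyone who wants to watch these port states from a script, here is a minimal sketch that reads the same information the kernel exposes in `/proc/net/tcp` (IPv4 sockets only). The helper name `port_states` is made up for this example. The state codes are the standard Linux ones, and 8300/8301 are the VARA ports mentioned in this thread.

```python
# State codes as printed in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

VARA_PORTS = (8300, 8301)  # ports named in this thread


def port_states(proc_text, ports=VARA_PORTS):
    """Return (local_port, remote_port, state) for sockets touching `ports`.

    `proc_text` is the full text of /proc/net/tcp. Hypothetical helper,
    not part of VARA or any existing tool.
    """
    rows = []
    for line in proc_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # Addresses look like "0100007F:206C": hex IP, colon, hex port.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        if local_port in ports or remote_port in ports:
            state = TCP_STATES.get(fields[3], fields[3])
            rows.append((local_port, remote_port, state))
    return rows
```

On a Linux box it could be called as `port_states(open("/proc/net/tcp").read())` right after closing VARA, then again a minute later, to see the entries for 8300/8301 come and go.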
> Is this a Winlink-only issue, or any TCP application? VarAC issues?

It's an issue with VARA - specifically, how VARA has been programmed to deal with TCP port reconnections and TIME_WAIT stuff. I believe that there is a way to work around this in VB6 (the language VARA is programmed in), but I'm not a programmer and also don't have access to VARA's source code to test anything.
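To illustrate the general kind of workaround being referred to, here is a small sketch of the `SO_REUSEADDR` socket option, shown in Python rather than VB6. It is not VARA's code, and nothing in this thread confirms what VARA or Wine actually does with its sockets. The function names are made up for the example.

```python
import socket


def make_listener(port, reuse):
    """Bind and listen on 127.0.0.1:port, optionally with SO_REUSEADDR."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse:
        # Has to be set before bind() to have any effect.
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        srv.bind(("127.0.0.1", port))
    except OSError:
        srv.close()
        raise
    srv.listen(1)
    return srv


def leave_port_in_time_wait():
    """Serve one connection, close the server side first, return the port."""
    srv = make_listener(0, reuse=True)  # port 0: let the OS pick a free port
    port = srv.getsockname()[1]
    client = socket.create_connection(("127.0.0.1", port))
    conn, _ = srv.accept()
    conn.close()    # server closes first, so its end lingers in TIME_WAIT
    client.recv(1)  # client reads the end-of-stream
    client.close()
    srv.close()
    return port


def can_rebind(port, reuse):
    """Try to listen on the port again and report whether bind() worked."""
    try:
        make_listener(port, reuse).close()
        return True
    except OSError:
        return False
```

After `port = leave_port_in_time_wait()`, calling `can_rebind(port, reuse=True)` should succeed immediately. On Linux, `can_rebind(port, reuse=False)` is typically refused with "address already in use" until the `TIME_WAIT` entry expires, which matches the roughly 60-second window described in this issue.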
To be honest, I'm more interested in the wine/box86/emulation side of things and don't really use or test VARA much otherwise. Last I knew, these issues weren't fixed, but it's possible maybe Jose ended up patching this in. I haven't tested it in a while, but I think pe1rrr would know more since he's tested it more recently.
This is all as far as I know... again, pe1rrr has more first-hand experience with the problem and the ways it impacts users. (@pe1rrr, feel free to correct any info I got wrong here)
👍 So far so good.
Great summary of an otherwise unfortunate issue. For what it's worth, this problem also occurs with CrossOver on macOS.
FYI, for anybody looking for a workaround: I've created varanny, a launcher for VARA. Among other things, it helps start/stop a VARA instance remotely and can manage VARA.ini files so that multiple configurations can co-exist. It also takes care of service discovery by advertising VARA via DNS-SD. RadioMail has supported this since v1.3.
This problem affects Linux/Wine, but does not occur on Windows. Fixing this for Linux would make VARA much more usable for users who do not have Windows.