Closed hg closed 2 years ago
This also flushed out an old bug which I've often run into in the past:
pub
with a publicly available IP and port.nat
behind NAT or firewall.pub
starts connecting to nat
and fails because of firewall/NAT.nat
successfully connects back to pub
.pub
still thinks it has a pending outgoing connection to nat
.pub
adds nat
's public address to address cache (in pong
handler, graph.c
, or any other place), not realizing it's using a "client" port which won't be available anymore after the current connection is closed.One might wonder why node nat
is being given a public address when it cannot accept incoming connections, but it makes sense if there are multiple nodes behind the firewall. External nodes can be configured to ignore them, at the cost of much additional complexity in Ansible playbooks.
Here's how it looks in logs:
2022-05-20 18:12:13 NOTICE Connection from x.y.z.q port 60310
2022-05-20 18:12:13 NOTICE Trying to re-establish outgoing connection in 5 seconds
2022-05-20 18:12:13 NOTICE Connection with foo (x.y.z.q port 60310) activated
2022-05-20 18:12:18 INFO Already connected to foo
2022-05-20 18:12:18 DEBUG Caching recent address for foo
After restarting tincd it uses wrongly cached ports (both nodes should use 655):
2022-05-20 18:14:56 DEBUG Error while connecting to foo (x.q.z.y port 35870): Connection refused
2022-05-20 18:14:57 DEBUG Error while connecting to bar (x.y.z.q port 60310): Connection refused
The last commit attempts to fix that. It may be completely wrong because I'm not sure why we're retrying outgoing connections if there already is a working connection.
I'm getting a failure in the fs test:
not ok 10 - test_fopenmask_new
# 0x1e8 != 0x1c0
# ../test/unit/test_fs.c:210: error: Failure!
The reason is that I have my umask set to 077. The fs unit tests should probably call umask(0)
before doing anything else.
Thanks, nice work again! One thing I want to note though is that you often add unrelated commits to a pull request. For example, in this PR you fix an outgoing connection issue and add Markdown reformatting to lint.py. Those have nothing to do with the address cache. The refactoring of filesystem-related functions is somewhat related, as it cleans up some of the code dealing with making directories, but it could also have been in a separate PR. Splitting them makes reviewing easier and increases the likelihood that I can merge individual PRs earlier.
See #373.
Cache directory is recreated as necessary. While it may be annoying for some users, I don't think it has any real downsides (besides maybe adding a small amount of flash wear on old routers), and there are lots of existing configurations out there that could benefit from this cache. I don't expect many users to read release notes for all software they use.