Open dbehnke opened 3 months ago
What version of ASL3 do you have installed? If you are running a version prior to v3.0.4 then you are missing the fix for https://github.com/AllStarLink/app_rpt/pull/372.
Also, we recently had another reported issue, https://github.com/AllStarLink/app_rpt/issues/386, that turned out to be a network configuration issue.
From the node, do you have reasonable network connectivity? Can you resolve/ping hostnames (e.g. ping register.allstarlink.org
)?
I'm checking the notes from the previous ticket there to see if this applys here.. in the meantime here's ping results..
Version is Asterisk 20.9.1+asl3-3.0.4-1.deb12
ping register.allstarlink.org
PING register.allstarlink.org (162.248.92.131) 56(84) bytes of data.
64 bytes from v-162-248-92-131.unman-vds.premium-chicago.nfoservers.com (162.248.92.131): icmp_seq=1 ttl=56 time=30.2 ms
64 bytes from v-162-248-92-131.unman-vds.premium-chicago.nfoservers.com (162.248.92.131): icmp_seq=2 ttl=56 time=25.6 ms
64 bytes from v-162-248-92-131.unman-vds.premium-chicago.nfoservers.com (162.248.92.131): icmp_seq=3 ttl=56 time=26.2 ms
64 bytes from v-162-248-92-131.unman-vds.premium-chicago.nfoservers.com (162.248.92.131): icmp_seq=4 ttl=56 time=25.9 ms
64 bytes from v-162-248-92-131.unman-vds.premium-chicago.nfoservers.com (162.248.92.131): icmp_seq=5 ttl=56 time=22.0 ms
64 bytes from v-162-248-92-131.unman-vds.premium-chicago.nfoservers.com (162.248.92.131): icmp_seq=6 ttl=56 time=32.6 ms
64 bytes from v-162-248-92-131.unman-vds.premium-chicago.nfoservers.com (162.248.92.131): icmp_seq=7 ttl=56 time=34.9 ms
64 bytes from v-162-248-92-131.unman-vds.premium-chicago.nfoservers.com (162.248.92.131): icmp_seq=8 ttl=56 time=28.4 ms
64 bytes from v-162-248-92-131.unman-vds.premium-chicago.nfoservers.com (162.248.92.131): icmp_seq=9 ttl=56 time=26.7 ms
It have determined this issue is with Tailscale and not Allstar. Tailscale is doing odd things with its internal DNS which is supposed to passthrough to a provider of your choice.. i.e. 8.8.8.8
I still don't quite understand why that would cut the numbers off at 4 digits.. but there we are.
The solution for me was to just disable Tailscale's DNS and use 8.8.8.8 directly on the OS.
For future viewers that use tailscale...
sudo tailscale up --accept-dns=false --accept-routes
If you want you can leave the issue open for a day or two in case anyone else has comments/suggestions..
Great to hear that you have solved the problem.
Dave,
Clearly, the simple "can we resolve "a" DNS name was not a good "quick" DNS test. How did you determine that the issue was with Tailscale's DNS? Is there a good test we could add (or at least document) that would help identify this issue?
So for tailscale.. they have something called tailnet.. that allows you to do dns lookups to your other nodes. and when you enable tailscale with tailscale up
it will change your dns server to 100.100.100.100 which is tailscale's internal DNS. I noticed this by looking at my /etc/resolv.conf that it was set to that. If I temporily changed it to 8.8.8.8 in the /etc/resolv.conf, I was then able to connect to allstar nodes. So what I think happens when you have it set to 100.100.100.100 it tries to then bounce to your internal address .. in my case 192.168.0.53 is my dns locally. and it can't do that from its point of view in the cloud. One thing I didn't try (in the screenshot below) is to flip the switch to override local DNS... i'm wondering if that would also work.
This is likely related to a know DNS services defect on the infra side. Try not using the intercepted DNS.
Yeah I've kinda given up for now with this .. even with tailscale on and disabling intercepting DNS (using 8.8.8.8) it still has an issue. The only way i was able to work around the issue for now is to use FILE mode in rpt.conf (turning off DNS lookup) or turn off tailscale and then DNS works.
I use tailscale as a vpn between my nodes and as an exit node for when i'm on cellular to appear with the same ip. (ATT likes to change IPs unfortunately). Not sure what's changed between first release and 3.0.4 but something broke using tailscale for me.
@dbehnke Would you be willing to setup your system with the failing configuration for me to get some debugging information. We need to at least understand why this causes the failure.
If you are willing have some questions and steps.
Here is some debugging information from my computer. I was able to get the failure as soon as I started asterisk. I will note that I enabled res_resolver_unbounded.so and set the nameserver to 1.1.1.1.
k4mdddevelop*CLI> rpt fun 2231 *353431
[2024-08-26 15:22:19.543] DEBUG[43943]: app_rpt/rpt_telemetry.c:381 cancel_pfxtone: cancel_pfxfone!!
[2024-08-26 15:22:19.543] DEBUG[43943]: app_rpt.c:1435 collect_function_digits: digits=3 source=0
[2024-08-26 15:22:19.543] DEBUG[43943]: app_rpt.c:1487 collect_function_digits: @@@@ action: ilink, param = 3
[2024-08-26 15:22:19.543] DEBUG[43943]: app_rpt.c:1493 collect_function_digits: @@@@ table index i = 3
[2024-08-26 15:22:19.543] DEBUG[43943]: app_rpt/rpt_functions.c:69 function_ilink: @@@@ ilink param = 3, digitbuf =
[2024-08-26 15:22:19.543] DEBUG[43943]: app_rpt.c:1505 collect_function_digits: rv=0
[2024-08-26 15:22:19.654] DEBUG[43943]: app_rpt/rpt_telemetry.c:381 cancel_pfxtone: cancel_pfxfone!!
[2024-08-26 15:22:19.654] DEBUG[43943]: app_rpt.c:1435 collect_function_digits: digits=35 source=0
[2024-08-26 15:22:19.654] DEBUG[43943]: app_rpt.c:1487 collect_function_digits: @@@@ action: ilink, param = 3
[2024-08-26 15:22:19.655] DEBUG[43943]: app_rpt.c:1493 collect_function_digits: @@@@ table index i = 3
[2024-08-26 15:22:19.655] DEBUG[43943]: app_rpt/rpt_functions.c:69 function_ilink: @@@@ ilink param = 3, digitbuf = 5
[2024-08-26 15:22:19.655] DEBUG[43943]: app_rpt/rpt_config.c:312 tlb_query_node_exists: chan_tlb not loaded.
[2024-08-26 15:22:19.655] NOTICE[43943]: app_rpt/rpt_config.c:624 node_lookup: Longest node 6.
[2024-08-26 15:22:19.655] DEBUG[43943]: app_rpt.c:1505 collect_function_digits: rv=0
[2024-08-26 15:22:19.765] DEBUG[43943]: app_rpt/rpt_telemetry.c:381 cancel_pfxtone: cancel_pfxfone!!
[2024-08-26 15:22:19.765] DEBUG[43943]: app_rpt.c:1435 collect_function_digits: digits=353 source=0
[2024-08-26 15:22:19.766] DEBUG[43943]: app_rpt.c:1487 collect_function_digits: @@@@ action: ilink, param = 3
[2024-08-26 15:22:19.766] DEBUG[43943]: app_rpt.c:1493 collect_function_digits: @@@@ table index i = 3
[2024-08-26 15:22:19.766] DEBUG[43943]: app_rpt/rpt_functions.c:69 function_ilink: @@@@ ilink param = 3, digitbuf = 53
[2024-08-26 15:22:19.766] DEBUG[43943]: app_rpt/rpt_config.c:312 tlb_query_node_exists: chan_tlb not loaded.
[2024-08-26 15:22:19.766] NOTICE[43943]: app_rpt/rpt_config.c:624 node_lookup: Longest node 6.
[2024-08-26 15:22:19.766] DEBUG[43943]: app_rpt.c:1505 collect_function_digits: rv=0
[2024-08-26 15:22:19.885] DEBUG[43943]: app_rpt/rpt_telemetry.c:381 cancel_pfxtone: cancel_pfxfone!!
[2024-08-26 15:22:19.885] DEBUG[43943]: app_rpt.c:1435 collect_function_digits: digits=3534 source=0
[2024-08-26 15:22:19.885] DEBUG[43943]: app_rpt.c:1487 collect_function_digits: @@@@ action: ilink, param = 3
[2024-08-26 15:22:19.885] DEBUG[43943]: app_rpt.c:1493 collect_function_digits: @@@@ table index i = 3
[2024-08-26 15:22:19.885] DEBUG[43943]: app_rpt/rpt_functions.c:69 function_ilink: @@@@ ilink param = 3, digitbuf = 534
[2024-08-26 15:22:19.886] DEBUG[43943]: app_rpt/rpt_config.c:312 tlb_query_node_exists: chan_tlb not loaded.
[2024-08-26 15:22:19.886] NOTICE[43943]: app_rpt/rpt_config.c:624 node_lookup: Longest node 6.
[2024-08-26 15:22:19.886] DEBUG[43943]: app_rpt.c:1505 collect_function_digits: rv=0
[2024-08-26 15:22:19.968] == Manager 'admin' logged on from 127.0.0.1
[2024-08-26 15:22:19.969] == Manager 'admin' logged on from 127.0.0.1
[2024-08-26 15:22:19.995] DEBUG[43943]: app_rpt/rpt_telemetry.c:381 cancel_pfxtone: cancel_pfxfone!!
[2024-08-26 15:22:19.995] DEBUG[43943]: app_rpt.c:1435 collect_function_digits: digits=35343 source=0
[2024-08-26 15:22:19.995] DEBUG[43943]: app_rpt.c:1487 collect_function_digits: @@@@ action: ilink, param = 3
[2024-08-26 15:22:19.996] DEBUG[43943]: app_rpt.c:1493 collect_function_digits: @@@@ table index i = 3
[2024-08-26 15:22:19.996] DEBUG[43943]: app_rpt/rpt_functions.c:69 function_ilink: @@@@ ilink param = 3, digitbuf = 5343
[2024-08-26 15:22:19.996] DEBUG[43943]: app_rpt/rpt_config.c:312 tlb_query_node_exists: chan_tlb not loaded.
[2024-08-26 15:22:19.996] DEBUG[43943]: app_rpt/rpt_config.c:396 node_lookup_bydns: Resolving DNS SRV records for: _iax._udp.5343.nodes.allstarlink.org
[2024-08-26 15:22:23.137] DEBUG[43943]: app_rpt/rpt_config.c:411 node_lookup_bydns: No SRV records returned for _iax._udp.5343.nodes.allstarlink.org
[2024-08-26 15:22:23.137] NOTICE[43943]: app_rpt/rpt_config.c:624 node_lookup: Longest node 6.
[2024-08-26 15:22:23.138] DEBUG[43943]: app_rpt.c:1505 collect_function_digits: rv=0
[2024-08-26 15:22:23.138] DEBUG[43943]: app_rpt/rpt_telemetry.c:381 cancel_pfxtone: cancel_pfxfone!!
The main problem is the time to look up DNS records. I have confirmed it here.
app_rpt does not process more that 4 digits due to the DTMF timer expiring. The DTMF timer is hard coded for 3 seconds. If the DNS lookup does not complete in 3 seconds, the DTMF timer expires and the DTMF buffer is reset. At that point you see a cancel_pfxfone!!
message and DTMF processing stops.
So, to be clear, there's a complete time allocation of 3s to receive all DTMF tones and then execute the complete round-trip of the DNS lookup? In general, the DNS lookup de facto timers are to allocate 4 seconds for DNS lookups (2s for a first lookup, 2s to do a second resolver lookup if the first fails/timesout).
I was not clear. The DTMF timer is reset with each digit.
Describe the bug When I attempt to do *3 + >4 digit node number, linking does not work.. digits after 4 get cut off
To Reproduce Steps to reproduce the behavior:
Expected behavior I can dial node numbers > 4 digits
sample log This is me trying to connect to 594950 using dtmf, notice only 5949 gets picked up