softerhardware / Hermes-Lite2

A second generation low-cost amateur HF software defined radio transceiver.
http://www.hermeslite.com

HL2 stops streaming data when DHCP renewal happens #151

Status: Closed (closed by pjsg 3 years ago)

pjsg commented 3 years ago

I have my HL2 using DHCP to get its address. I have the current SparkSDR running on Windows. After a period of time, the HL2 stops streaming data to SparkSDR. It recovers instantly if you press the power button (twice) in SparkSDR.

It appears that the DHCP ACK packet stops the streaming. The attached PCAP covers the packets of interest. It was captured on a span port for the link to the HL2, so it includes all packets sent and received by the HL2: hl2.zip

.135 is the HL2, .124 is SparkSDR, .68 is the DHCP server

I don't know how good the timestamps are, but there is the additional oddity that the sending of the data packets seems to stop just a little before the DHCP ack is seen. Maybe it isn't the dhcp ack, but the ARP response that causes problems.

pjsg commented 3 years ago

Actually, I think I understand the timestamps -- it looks to me as though the capturing system typically interrupts the kernel for every 2 packets (based on the other timestamps) or on a short timeout. I think that the dhcp response was the first of two packets and so it was only marked as being received after a timeout. Thus, I think, the root cause is the reception of this packet.

Also note that this gateware is my code with the IP ids -- you can see these incrementing nicely. Unfortunately I didn't see if a ping would use the next number, or whether there are some lost packets. I'll redo the test today.

softerhardware commented 3 years ago

Hi Philip,

Thanks for the pcap file. I did take a look last night. I see there are more DHCP options than typically seen in the ACK but could find no problems in the RTL with this. (In the past we had an issue with some options.)

To clarify, are you saying that the HL2 really stops sending after the DHCP RQST and not after the DHCP ACK as seen in the pcap file?

In your network, does a DHCP RQST result in any physical change to the connection, such as renegotiating the speed? This could cause the DHCP FSM to reset.

What happens after the HL2 stops sending data? In the FSM, the HL2 will wait for 20 seconds for the DHCP ACK. If no DHCP ACK is seen, it will wait for 5 minutes and then attempt another DHCP renewal request. Do you see that?
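For illustration only, the renewal timing described above (20 seconds waiting for the ACK, then a 5 minute pause before the next request) can be sketched as a small state machine. This is a Python model, not the HL2 RTL; the class, state, and method names are hypothetical, and only the two timeouts come from the description.

```python
from enum import Enum, auto

ACK_TIMEOUT_S = 20       # time to wait for a DHCP ACK (from the description above)
RETRY_DELAY_S = 5 * 60   # pause before the next renewal request (from the description above)


class State(Enum):       # hypothetical state names, not taken from the RTL
    BOUND = auto()       # lease is valid, nothing pending
    WAIT_ACK = auto()    # renewal request sent, waiting for the ACK
    BACKOFF = auto()     # no ACK arrived, waiting before retrying


class RenewalFsm:
    def __init__(self):
        self.state = State.BOUND
        self.deadline = None

    def start_renewal(self, now):
        """Call when a renewal request is sent."""
        self.state = State.WAIT_ACK
        self.deadline = now + ACK_TIMEOUT_S

    def on_ack(self, now):
        """Call when a DHCP ACK is received."""
        if self.state is State.WAIT_ACK:
            self.state = State.BOUND
            self.deadline = None

    def tick(self, now):
        """Advance the timers; returns True when a new request should be sent."""
        if self.state is State.WAIT_ACK and now >= self.deadline:
            self.state = State.BACKOFF
            self.deadline = now + RETRY_DELAY_S
        elif self.state is State.BACKOFF and now >= self.deadline:
            self.start_renewal(now)
            return True
        return False
```

If the capture matches this model, a renewal with a lost ACK would show a second request roughly 5 minutes and 20 seconds after the first.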

After the HL2 stops sending data, are you able to ping the HL2? What is the status of the 4 LEDs on the HL2?

You mention that you are using your mod with the IP ids. Is this build now timing clean? Timing issues could lead to some of the strange behavior described.

I will reduce my DHCP server's lease time and capture a few DHCP renewal sequences on my network to compare.

73,

Steve kf7o

softerhardware commented 3 years ago

Hi Philip,

I captured several HL2 DHCP renewals on my network. The HL2 continued to send data after the renewal. The only difference I noticed was that the DHCP ACK from my server included 8 bytes of padding after the option end of 0xff. Can you enable padding on your server to see if the HL2 is sensitive to this? I can't figure out how to disable it on mine. I am running https://freshtomato.org/ which uses dnsmasq for DHCP services. See the picture below.

[image: wireshark1] https://user-images.githubusercontent.com/6146461/103378714-1c95dd00-4a98-11eb-91a2-4f4d2af79995.png
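To make the padding question concrete, here is a minimal Python sketch of a DHCP options walk that stops at the End option (0xff) and therefore does not care whether padding follows it. It models the standard option encoding (code, length, value, with Pad = 0 and End = 255); it is not the HL2 RTL, and the function name and example bytes are made up for illustration.

```python
def parse_dhcp_options(data: bytes) -> dict:
    """Walk DHCP options (the bytes after the magic cookie) into {code: value}."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == 0xFF:            # End option: ignore anything after it
            break
        if code == 0x00:            # Pad option: single byte, no length field
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated option header")
        length = data[i + 1]
        value = data[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated option value")
        options[code] = value
        i += 2 + length
    return options


# Hypothetical ACK options: message type 5 (ACK) and a 3600 s lease time.
without_padding = bytes([53, 1, 5, 51, 4, 0, 0, 0x0E, 0x10, 0xFF])
with_padding = without_padding + bytes(8)   # 8 zero bytes after the End option

assert parse_dhcp_options(without_padding) == parse_dhcp_options(with_padding)
```

A parser that instead relied on the total packet length, or kept reading past 0xff, is the kind of implementation that could behave differently between the two servers.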

I was running two 384kHz receivers for this test. To match your setup, which software, how many receivers, and what receiver bandwidth are you using?

73,

Steve kf7o

pjsg commented 3 years ago

I'm running 10 rx at 192k. I'm currently waiting for it to stop again. I may have to reduce the lease time....


pjsg commented 3 years ago

I figured it out -- it was the mysterious ICMP unreachable that was the problem. I don't know why it was sent, but the unreachable handler stops streaming. I have a mod that I'm about to test that only stops streaming if the source port on the returned packet is 1024. I'll put up a PR.
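As a rough illustration of that check (a Python model, not the actual gateware change), an ICMP destination unreachable message carries the IP header and the first 8 bytes of the datagram that triggered it, so the original UDP source port can be read back out. The function name is hypothetical; the port number 1024 is the one mentioned above.

```python
import struct

STREAM_PORT = 1024            # source port named in the comment above
ICMP_DEST_UNREACHABLE = 3     # ICMP type for destination unreachable
IPPROTO_UDP = 17              # IP protocol number for UDP


def unreachable_refers_to_stream(icmp: bytes, stream_port: int = STREAM_PORT) -> bool:
    """Return True if an ICMP unreachable quotes a UDP datagram sent from stream_port.

    `icmp` starts at the ICMP header: type, code, checksum, 4 unused bytes,
    then the quoted IPv4 header and at least 8 bytes of the quoted datagram.
    """
    if len(icmp) < 8 or icmp[0] != ICMP_DEST_UNREACHABLE:
        return False
    inner = icmp[8:]                              # quoted original packet
    if len(inner) < 20 or inner[0] >> 4 != 4:     # need a full IPv4 header
        return False
    ihl = (inner[0] & 0x0F) * 4                   # quoted IP header length in bytes
    if ihl < 20 or len(inner) < ihl + 4:
        return False
    if inner[9] != IPPROTO_UDP:                   # only UDP carries the stream
        return False
    src_port, _dst_port = struct.unpack_from("!HH", inner, ihl)
    return src_port == stream_port
```

With a filter like this, an unreachable triggered by some other traffic from the radio (for example a DHCP exchange) would no longer stop the stream.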

Sorry for the confusion.

Philip
