Unidata / LDM

The Unidata Local Data Manager (LDM) system includes network client and server programs designed for event-driven data distribution, and is the fundamental component of the Unidata Internet Data Distribution (IDD) system.
http://www.unidata.ucar.edu/software/ldm

LDM should reorder UDP packets before ingesting them #74

Open · childofthewired opened 3 years ago

childofthewired commented 3 years ago

LDM Version 6.13.12.69

Environment:

- Red Hat Enterprise Linux 7
- RHV Hypervisor
- Juniper Switch / Cisco Switch
- Satellite Receiver

NOAAPORT UDP packets that do not arrive in order are rejected, and readnoaaport.c throws an error.

While this appears to be by design, it impacts the use of LDM in load-balanced or virtual-machine environments where bonded NICs are used in either the hypervisor or a physical host.

We have worked around this by disabling bonding for the interface that the NOAAPORT data arrives on, but this greatly reduces the reliability of the hardware.

LDM cannot reorder the UDP packets, and so it drops the product, even though the entire product exists in the data.

The RFC guidelines for UDP usage state: "Applications that require ordered delivery MUST reestablish datagram ordering themselves."

https://tools.ietf.org/html/rfc8085#section-3.3

semmerson commented 3 years ago

Hi @childofthewired,

The LDM was designed assuming that the DVB-S receiver would have a hard-wired connection (with some layer 2 switches, possibly) to the computer running the noaaportIngester(1) program. That's the case here, at our client universities, and at all of NOAA's WFOs.

We run multiple instances of noaaportIngester(1) on separate computers for redundancy and have a reliability rate that's at least 99.999%. AFAIK, the AWIPS system does the same.

WAN applications that require UDP packets to be delivered in order should, indeed, have a mechanism for re-ordering packets. There is a piece of software from the University of Wisconsin that performs this. We've successfully used this software between the NOAAPort receiver and the noaaportIngester(1) program over a WAN. Perhaps that could solve your problem.

johnsimcall commented 1 year ago

Thanks @semmerson , and sorry for resurrecting an old thread. I tried to search for the University of Wisconsin software you mentioned to re-order UDP packets, but couldn't find anything that looked right. Can you point me in the right direction, please?

semmerson commented 1 year ago

@johnsimcall Hang on. We're talking amongst ourselves about the best solution for you.

semmerson commented 1 year ago

@johnsimcall A Novra can't have a bonded NIC. Would you please explain how one comes about, and how it increases reliability when the Novra can't use one?

semmerson commented 1 year ago

@johnsimcall Would it be possible to use active-backup mode in the bonded interface? This would ensure redundancy and, with a sufficiently large receive buffer setting in noaaportIngester(1), should allow a VM to easily keep up with the maximum NOAAPort bit rate of 60 Mbps.
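A hedged sketch of those two knobs, assuming the bond is managed by NetworkManager and that the ingester's receive buffer is ultimately capped by the kernel's net.core.rmem_max limit; the connection name bond0 and the buffer sizes are placeholders, not tested values:

    # Switch the existing bond to active-backup so only one slave
    # carries traffic at a time and no reordering occurs across links.
    # "bond0" is a hypothetical connection name; substitute your own.
    nmcli con mod bond0 bond.options "mode=active-backup,miimon=100"
    nmcli con up bond0

    # Raise the kernel's socket receive-buffer ceiling so a large
    # buffer requested by the ingest process can actually be granted.
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.rmem_default=4194304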

johnsimcall commented 1 year ago

Thanks @semmerson , you're right, the Novra has a single network connection. I'll attempt to better describe the environment where @childofthewired and I are seeing out-of-order packets.

The Novra is connected to a switch (switch1) which in turn connects to a pair of Juniper switches (switch2 & switch3) that are configured as a single logical unit / virtual chassis. The hypervisor server (Dell) is connected, via LACP/802.3ad bonding, to switch2 & switch3. A Virtual Machine on the hypervisor server, with a single virtual NIC, runs the LDM software and sees out-of-order packet delivery.

                   / -- switch2 -- \
Novra -- switch1 <        ||         > == Dell ~~ VM(LDM)
                   \ -- switch3 -- /

We have also tried to connect switch1 directly to the Dell server, but we still see out-of-order issues.

Thank you for suggesting a large receive buffer setting in noaaportIngester; we'll take a look at that. I'm also going to see if the Juniper equipment being used supports the "strict-packet-order" configuration. The documentation says:

strict-packet-order | You can use this command to maintain multicast traffic order and resolve packet drop issue

semmerson commented 1 year ago

Hi John,

> The Novra is connected to a switch (switch1) which in turn connects to a pair of Juniper switches (switch2 & switch3) that are configured as a single logical unit / virtual chassis (https://www.juniper.net/documentation/us/en/software/junos/virtual-chassis-qfx/index.html). The hypervisor server (Dell) is connected, via LACP/802.3ad bonding, to switch2 & switch3.

One of us here thinks that 802.3ad bonding should act similarly to "active-backup" mode -- which doesn't reorder packets -- but isn't certain.

> We have also tried to connect switch1 directly to the Dell server, but we still see out-of-order issues.

That's consistent with the Dell hypervisor reordering the packets.

> Thank you for suggesting a large receive buffer setting in noaaportIngester; we'll take a look at that.

If the hypervisor is reordering the packets, then that won't work. Please let us know.

> I'm also going to see if the Juniper equipment being used supports the "strict-packet-order" configuration. The documentation (https://www.juniper.net/documentation/us/en/software/junos/flow-packet-processing/topics/ref/statement/security-edit-flow-security-flow.html) says:
>
> strict-packet-order | You can use this command to maintain multicast traffic order and resolve packet drop issue

Same issue if the problem lies with the hypervisor.

We do use a utility here that sits between the NOAAPort stream and noaaportIngester(1) and ensures that the NOAAPort frames are in strictly monotonic order. It might need modification to fit your situation.

If you would like to Google Meet to discuss this reordering issue, we're available.

--Steve


johnsimcall commented 1 year ago

> If you would like to Google Meet to discuss this reordering issue, we're available.

Thank you @semmerson ! I'll reach out after the New Year to see if we can chat for a few minutes. Happy holidays!

johnsimcall commented 6 months ago

Oops, I forgot to post the resolution to this, which was discovered by Sean Webb in January 2023. Sean discovered that having two NICs up/online resulted in duplicated, dropped, and out-of-order packet delivery. Shutting down the second NIC resolved the issue; however, the procedure for shutting down the NIC changed between RHEL7 (ifdown eth0) and RHEL8 (ip link set eth0 down). Please note that the nmcli con down eth0 command in RHEL8 is not sufficient, because it removes the IP configuration from the NIC but doesn't set the link status to down. A custom NetworkManager dispatcher script can be created to set the link status to down when the second/backup NIC is not in use.
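A minimal sketch of such a dispatcher script, assuming eth0 is the backup SBN interface and eth1 is the primary; the interface names, file name, and trigger condition are all hypothetical (see NetworkManager-dispatcher(8) for the interface/action arguments):

    #!/bin/bash
    # /etc/NetworkManager/dispatcher.d/90-sbn-backup-down
    # NetworkManager passes the interface as $1 and the action as $2.
    # Whenever the primary SBN NIC (eth1, hypothetical) comes up, force
    # the backup NIC's (eth0) link state all the way down so it stops
    # delivering duplicate multicast traffic.
    IFACE="$1"
    ACTION="$2"

    if [ "$IFACE" = "eth1" ] && [ "$ACTION" = "up" ]; then
        ip link set eth0 down
    fi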

Ok, when I was taking another look at this to get some RHEL7 vs RHEL8 packet captures, I found the issue. One thing we didn't show is that we actually have 2 Novra DVB receivers. I think the two paths look like this:

Novra1 ---> | switch1, port 20 (vlan.101) |
            | switch1, port 21 (vlan.101) | ---> Dell Server1 eno3/sbn1
            | switch1, port 22 (vlan.101) | ---> Dell Server2 eno3/sbn1
            | switch1, port 23 (vlan.101) | ---> Dell Server3 eno3/sbn1 == linux-rhv-bridge == VM (rhel8-vm1 eth0)

Novra2 ---> | switch2, port 20 (vlan.201) |
            | switch2, port 21 (vlan.201) | ---> Dell Server1 eno4/sbn2
            | switch2, port 22 (vlan.201) | ---> Dell Server2 eno4/sbn2
            | switch2, port 23 (vlan.201) | ---> Dell Server3 eno4/sbn2 == linux-rhv-bridge == VM (rhel8-vm1 eth1)

If I change one of the VM's NIC link states to DOWN, then the GAPS GO AWAY! So this issue hasn't been that the data is coming in out of order on the interface; the issue is that BOTH interfaces are simultaneously broadcasting their multicast data even when we run "nmcli con down eth0". So LDM was receiving the multicast data from BOTH interfaces, seeing it out of order, and discarding most of the data.
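A generic way to confirm that behavior (not LDM-specific; the interface names are placeholders, and the 224.0.0.0/4 filter simply matches any multicast destination):

    # Capture on each VM interface in turn; if both show the same UDP
    # multicast stream, the VM is receiving every datagram twice.
    tcpdump -i eth0 -n 'udp and dst net 224.0.0.0/4'
    tcpdump -i eth1 -n 'udp and dst net 224.0.0.0/4'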

The difference is how we are managing the interface between RHEL7 and RHEL8. In RHEL7 we were using ifdown eth0 to shut down the inactive SBN interface, in which case the interface looked like this:

    4: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 56:6f:5d:e2:00:26 brd ff:ff:ff:ff:ff:ff

In RHEL8, we were using "nmcli con down eth0", but the interface was still UP; it just didn't have an IP assigned:

    4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 56:6f:5d:e2:00:26 brd ff:ff:ff:ff:ff:ff
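For reference, a minimal sketch of the RHEL8 commands that actually drop the link, with eth0 as the placeholder backup NIC:

    # nmcli con down removes the IP configuration but leaves the link UP;
    # ip link actually sets the link state to DOWN.
    ip link set eth0 down

    # Verify: the flags should no longer include UP,LOWER_UP and the
    # state should read DOWN.
    ip link show eth0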

stonecooper commented 6 months ago

You can actively feed from two (or more) Novra modems at the same time to create data feed redundancy by at least a couple of mechanisms.

The longest-utilized methodology takes advantage of the NAT'ing capability of the Novra modem itself. If using the Linux cmcs command, the first step is to save off the working configuration of the modem using the following:

cmcs -ip <modem_ip> -pw <password> -save working_configuration.xml

The resulting file can be hand-edited to create a NAT'ing of the multicast IP addresses, such that the edited <CONTENT> block would look like this:

<CONTENT>
    <TRANSPORT_STREAM PIDS="Selected">
        <PID Number="101" Processing="MPE" />
        <PID Number="102" Processing="MPE" />
        <PID Number="103" Processing="MPE" />
        <PID Number="104" Processing="MPE" />
        <PID Number="105" Processing="MPE" />
        <PID Number="106" Processing="MPE" />
        <PID Number="107" Processing="MPE" />
        <PID Number="108" Processing="MPE" />
        <PID Number="150" Processing="MPE" />
        <PID Number="151" Processing="MPE" />
        <PID Number="NULL" Processing="RAW" />
    </TRANSPORT_STREAM>
    <IP_REMAP_TABLE Enabled="true" RemapSourceIP="false">
        <IP_Remap_Rule Original_IP="224.0.1.1" New_IP="224.3.2.1" Mask="255.255.255.255" TTL="0" Action="Forward" />
        <IP_Remap_Rule Original_IP="224.0.1.2" New_IP="224.3.2.2" Mask="255.255.255.255" TTL="0" Action="Forward" />
        <IP_Remap_Rule Original_IP="224.0.1.3" New_IP="224.3.2.3" Mask="255.255.255.255" TTL="0" Action="Forward" />
        <IP_Remap_Rule Original_IP="224.0.1.4" New_IP="224.3.2.4" Mask="255.255.255.255" TTL="0" Action="Forward" />
        <IP_Remap_Rule Original_IP="224.0.1.5" New_IP="224.3.2.5" Mask="255.255.255.255" TTL="0" Action="Forward" />
        <IP_Remap_Rule Original_IP="224.0.1.6" New_IP="224.3.2.6" Mask="255.255.255.255" TTL="0" Action="Forward" />
        <IP_Remap_Rule Original_IP="224.0.1.7" New_IP="224.3.2.7" Mask="255.255.255.255" TTL="0" Action="Forward" />
        <IP_Remap_Rule Original_IP="224.0.1.8" New_IP="224.3.2.8" Mask="255.255.255.255" TTL="0" Action="Forward" />
        <IP_Remap_Rule Original_IP="224.0.1.9" New_IP="224.3.2.9" Mask="255.255.255.255" TTL="0" Action="Forward" />
        <IP_Remap_Rule Original_IP="224.0.1.10" New_IP="224.3.2.10" Mask="255.255.255.255" TTL="0" Action="Forward" />
    </IP_REMAP_TABLE>
</CONTENT>

And this would be done for only one of the Novras, leaving the other the same. This will shift the traffic from the altered Novra for the NMC channel, for instance, from 224.0.1.1:1201 to 224.3.2.1:1201. You could even have both modems on the same NIC via a hub or switch and use the "-m" flag for noaaportIngester to differentiate the shifted multicast address. Two instances of noaaportIngester would be receiving the same PID, but from two different modems, and deduplication would occur on the LDM queue.
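A minimal sketch of that two-instance arrangement, using the remapped NMC addresses above; any other options (logging, ports, feed types) are omitted here, so consult noaaportIngester(1) for a real invocation:

    # One ingester per modem; products land in the same LDM queue,
    # where duplicates are rejected.
    noaaportIngester -m 224.0.1.1 &   # unaltered Novra, NMC channel
    noaaportIngester -m 224.3.2.1 &   # remapped Novra, same channel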

Another method is more complicated and uses the newly released "blender" that is now part of the LDM source. It requires a lot more setup, but basically you stream the multicast data directly into a "fanout" service, using socat to translate the UDP multicast into a point-to-point TCP stream; the "blender" then accepts streams from multiple channels and merges them at the frame level, with the goal of a near-perfect stream.
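A rough sketch of the socat leg only, for one channel; the multicast group, local interface address, and the fanout host and port are all hypothetical, and the blender itself is configured per the LDM documentation:

    # Join the NMC multicast group and relay the datagrams,
    # unidirectionally, as a point-to-point TCP stream toward the
    # fanout/blender host.
    socat -u \
        'UDP4-RECV:1201,ip-add-membership=224.0.1.1:192.168.1.10,reuseaddr' \
        'TCP4:blender-host.example:1201'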

Stonie Cooper, PhD
Software Engineer III
NSF Unidata Program Center
University Corporation for Atmospheric Research

I acknowledge that the land I live and work on is the traditional home of The Chahiksichahiks (Pawnee), The Umoⁿhoⁿ (Omaha), and The Jiwere (Otoe).


stonecooper commented 6 months ago

Additionally, the edited XML file will need to be loaded using the cmcs -load edited_configuration.xml command.
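Putting the two cmcs steps together as a hedged end-to-end sketch; the <modem_ip> and <password> placeholders, and the assumption that -load takes the same connection flags as -save, are not confirmed here:

    # 1. Save the modem's current configuration.
    cmcs -ip <modem_ip> -pw <password> -save working_configuration.xml

    # 2. Copy and hand-edit the IP_REMAP_TABLE block (see the XML above).
    cp working_configuration.xml edited_configuration.xml
    vi edited_configuration.xml

    # 3. Load the edited configuration back onto the modem.
    cmcs -ip <modem_ip> -pw <password> -load edited_configuration.xml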
