USNavalResearchLaboratory / norm

NACK-Oriented Reliable Multicast (NORM) implementation & tools (RFCs 5740, 5401)
https://www.nrl.navy.mil/Our-Work/Areas-of-Research/Information-Technology/NCS/NORM/

Unable to send 10mb file #45

Closed agrossmann09 closed 1 year ago

agrossmann09 commented 2 years ago

I am attempting to send a 10mb file from my MacBook to my Raspberry Pi as the receiver. However, I get an error:

Proto Error: ProtoSocket::SendTo() sendto() error: Message too long

I attempted to enable the fragmentation setting within NORM, but that didn't make a difference. I also altered my multicast MTU limit on both of my boxes, as I read that there is a limit imposed by the system by default. Using NORM, am I able to send messages and files over 1500 bytes? And if so, what do I have to enable in NORM or on my boxes?

bebopagogo commented 2 years ago

Are you using one of the norm example or demo applications to do this (e.g. "norm" or "normCast")? If so, which one? And what command-line options are you using?

With NORM, the only reason you would get this message is if you have set the NORM "segment size" to a value larger than what your system allows for UDP datagrams. (Note that NORM does add some headers per UDP packet it sends, so the segment size must be a little less than that limit.) The "segment size" is independent of the file object size. The NORM sender "segment size" is set as part of the NormStartSender() API call. You should not be passing anything related to file size to that API call. (My guess is that this is likely your issue.)

For a network with a 1500-byte MTU, the max UDP payload size (without IP fragmentation) is 1500 - 28 bytes, which is 1472 bytes. As mentioned, you need to allow some space per UDP packet for the additional NORM protocol headers (NORM_DATA messages). By default, NORM includes the FEC Object Transport Information (FEC OTI) header extension in every packet (16 bytes) in addition to the NORM_DATA header, so you should allow about 40 bytes. (There is an API call to reduce this overhead by causing NORM to only include the FEC OTI in NORM_INFO packets.)

agrossmann09 commented 2 years ago

I am not sure what changed in my code, I will blame it on the lack of sleep due to my newborn son, but now I am able to send 10mb files. I have my segment size set at 1024 to be within a safe limit. I do have a follow-up issue related to the sending of moderately sized files. The rate at which the file is sent is really slow, ~10 min for a 10mb file. Is there a setting in NORM to increase the rate of transfer?

bebopagogo commented 2 years ago

The default fixed transmission rate for NORM is pretty slow … perhaps something like 64 kbps. There is an API call to set the transmit rate, NormSetTxRate(), and you need to set that to something a little less than your link rate to allow for IP/Ethernet or other header overhead so you don't overwhelm the link.

NORM also supports automated rate control (aka “congestion control”) similar to TCP, and there are API calls to activate congestion control operation. The basic “TCP friendly” congestion control works very similarly to TCP and can suffer issues similar to TCP's in lossy, wireless environments. There is a “loss tolerant” option for NORM congestion control that is still TCP friendly and a little more robust in wireless (or other) network connections with some modest packet loss. Finally, there is an ECN (explicit congestion notification) mode that is very resilient to bit-error induced packet loss but requires that intermediate systems are configured to conduct RED-ECN queuing with ECN “marking” enabled.

Where you know the link rate, you can configure NORM for “fixed rate” operation, which is also very immune to packet loss: the rate is fixed and pushes through packet loss, whereas TCP (or similar) congestion control schemes mistake bit-error induced packet loss for congestion and reduce their transmission rate unnecessarily.

Cheers,

Brian


agrossmann09 commented 2 years ago
Okay, that definitely helped with my transfer speeds. I must have missed the default rate in the documentation. What is the API to use this method:

> There is a “loss tolerant” option for NORM congestion control that is still TCP friendly and a little more robust in wireless (or other) network connections with some modest packet loss.

Would that be setting session.setCongestionControl(true, true) or (true, false)?

bebopagogo commented 2 years ago

To invoke the loss tolerant congestion control option, you need to invoke an additional API call, since the ECN (aka NORM-CCE) and loss-tolerant (NORM-CCL) modes are more experimental (but mature) extensions to NORM-CC (the TCP-friendly congestion control that is part of the IETF spec).

To turn on congestion control, you use (C/C++) API:

NormSetCongestionControl(session, true, true);

To enable the loss-tolerant congestion control extension, you need to also call:

NormSetEcnSupport(session, false, false, true);

Since these were done as experimental modes, the “loss tolerant” option was added to the ECN control function instead of a separate control. I would have to check to see if this extra function is part of the Java binding. If it's not, it can be easily added.


agrossmann09 commented 2 years ago

Okay, both of those are within the Java binding. I am getting the best transfer rate by setting the Tx rate manually, so I will continue with that. However, I am encountering a situation where the file is sent across and I can tell all the bytes are there, but the receiver never registers an object complete, so it hangs. A small file works fine, so I don't think my network is the problem.

agrossmann09 commented 2 years ago

I looked through the bytes pending, and it seems that when the object has sent all the bytes and I get NORM_RX_OBJECT_UPDATED for the first time, the bytes pending starts to decrement. For example, if I send my 10MB file, I get this output.

Screen Shot 2021-11-17 at 12 38 37 PM

So the file on disk is there, but the process never completes.

agrossmann09 commented 2 years ago
Screen Shot 2021-11-17 at 1 23 25 PM

I also get this debug output. So it seems that I am suffering from a NACK timeout issue?

bebopagogo commented 2 years ago

I am on travel and will look at this in the coming days. Does your sender-side debug output show that the NACKs from the receivers are seen by the sender?


agrossmann09 commented 2 years ago

Okay, no rush; I was just updating the issue with some screenshots. No, on the sender side I just get the REPORT time output. And on the receiver side I get the following output that continually cycles. So it looks like there is always a NACK pending with 0 completed. Then eventually it maxes out the repair requests.

4 node>2528396014 * Proto Info: Remote sender>1343672144 grtt>0.266300 sec loss>0.000000 Proto Info: rxRate> 0.000 kbps rx_goodput> 0.000 kbps Proto Info: rxObjects> completed>0 pending>1 failed>0 Proto Info: fecBufferUsage> current>129024 peak>129024 overuns>0 Proto Info: strBufferUsage> current>0 peak>0 overuns>0 Proto Info: resyncs>0 nacks>9 suppressed>0 Proto Info: ***** Proto Info: REPORT time>20:48:21.961432 node>2855367022 * Proto Info: Local status: Proto Info: txRate>500000.000 kbps sentRate> 0.000 grtt>0.266300 Proto Info: ***** Proto Info: NormSenderNode::OnActivityTimeout() node>2528396014 for sender>1343672144 Proto Debug: NormSenderNode::RepairCheck() node>2528396014 begin NACK backoff: 1.053415 sec)... Proto Debug: NormSenderNode::HandleCommand() node>2528396014 begin CC back-off: 0.753155 sec)... Proto Debug: NormSenderNode::OnRepairTimeout() node>2528396014 sender>1343672144 end NACK back-off ... Proto Debug: NormSenderNode::OnRepairTimeout() node>2528396014 sending NACK rate:0.590234 kbps (rtt:0.266300 loss:0.000000 s:257) slow_start:0 Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>0->15 (maxPending = 15) Proto Trace: NormRepairRequest::AppendRepairRange(0:0:49->0:0:77) ... Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>1->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>2->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>3->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>4->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>5->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() BLOCK request Proto Trace: NormRepairRequest::AppendRepairRange(0:1:0->0:4:0) ... Proto Trace: NormRepairRequest::AppendRepairRange(0:5:48->0:5:76) ... 
Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>6->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>7->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>8->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>9->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>10->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() BLOCK request Proto Trace: NormRepairRequest::AppendRepairRange(0:6:0->0:9:0) ... Proto Trace: NormRepairRequest::AppendRepairRange(0:10:16->0:10:34) ... Proto Trace: NormRepairRequest::AppendRepairRange(0:10:61->0:10:76) ... Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>11->15 (maxPending = 15) Proto Trace: NormRepairRequest::AppendRepairRange(0:11:22->0:11:76) ... Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>12->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>13->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>14->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>15->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() BLOCK request Proto Trace: NormRepairRequest::AppendRepairRange(0:12:0->0:14:0) ... Proto Trace: NormRepairRequest::AppendRepairRange(0:15:16->0:15:31) ... Proto Trace: NormRepairRequest::AppendRepairRange(0:15:61->0:15:76) ... Proto Debug: NormSenderNode::OnRepairTimeout() node>2528396014 sender>1343672144 begin NACK hold-off: 1.597800 sec ... Proto Debug: NormSenderNode::OnRepairTimeout() node>2528396014 sender>1343672144 end NACK hold-off ... 
Proto Info: REPORT time>20:48:31.961140 node>2528396014 * Proto Info: Remote sender>1343672144 grtt>0.266300 sec loss>0.000000 Proto Info: rxRate> 0.022 kbps rx_goodput> 0.000 kbps Proto Info: rxObjects> completed>0 pending>1 failed>0 Proto Info: fecBufferUsage> current>129024 peak>129024 overuns>0 Proto Info: strBufferUsage> current>0 peak>0 overuns>0 Proto Info: Proto Info: resyncs>0 nacks>10 suppressed>0 Proto Info: ***** REPORT time>20:48:31.961872 node>2855367022 * Proto Info: Local status: Proto Info: txRate>500000.000 kbps sentRate> 0.000 grtt>0.266300 Proto Info: ***** Proto Info: NormSenderNode::OnActivityTimeout() node>2528396014 for sender>1343672144 Proto Debug: NormSenderNode::RepairCheck() node>2528396014 begin NACK backoff: 1.020462 sec)... Proto Debug: NormSenderNode::OnRepairTimeout() node>2528396014 sender>1343672144 end NACK back-off ... Proto Debug: NormSenderNode::OnRepairTimeout() node>2528396014 sending NACK rate:0.590234 kbps (rtt:0.266300 loss:0.000000 s:244) slow_start:0 Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>0->15 (maxPending = 15) Proto Trace: NormRepairRequest::AppendRepairRange(0:0:49->0:0:77) ... Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>1->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>2->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>3->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>4->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>5->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() BLOCK request Proto Trace: NormRepairRequest::AppendRepairRange(0:1:0->0:4:0) ... Proto Trace: NormRepairRequest::AppendRepairRange(0:5:48->0:5:76) ... 
Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>6->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>7->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>8->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>9->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>10->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() BLOCK request Proto Trace: NormRepairRequest::AppendRepairRange(0:6:0->0:9:0) ... Proto Trace: NormRepairRequest::AppendRepairRange(0:10:16->0:10:34) ... Proto Trace: NormRepairRequest::AppendRepairRange(0:10:61->0:10:76) ... Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>11->15 (maxPending = 15) Proto Trace: NormRepairRequest::AppendRepairRange(0:11:22->0:11:76) ... Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>12->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>13->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>14->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() node>2528396014 obj>0, blk>15->15 (maxPending = 15) Proto Trace: NormObject::AppendRepairRequest() BLOCK request Proto Trace: NormRepairRequest::AppendRepairRange(0:12:0->0:14:0) ... Proto Trace: NormRepairRequest::AppendRepairRange(0:15:16->0:15:31) ... Proto Trace: NormRepairRequest::AppendRepairRange(0:15:61->0:15:76) ... Proto Debug: NormSenderNode::OnRepairTimeout() node>2528396014 sender>1343672144 begin NACK hold-off: 1.597800 sec ... 
Proto Info: REPORT time>20:48:41.962896 node>2528396014 Proto Info: Remote sender>1343672144 grtt>0.266300 sec loss>0.000000 Proto Info: rxRate> 0.000 kbps rx_goodput> 0.000 kbps Proto Info: rxObjects> completed>0 pending>1 failed>0 Proto Info: fecBufferUsage> current>129024 peak>129024 overuns>0 Proto Info: strBufferUsage> current>0 peak>0 overuns>0 Proto Info: Proto Info: REPORT time>20:48:41.963039 node>2855367022 Proto Info: Local status: Proto Info: txRate>500000.000 kbps sentRate> 0.022 grtt>0.266300 Proto Info: resyncs>0 nacks>11 suppressed>0 Proto Info: Proto Debug: NormSenderNode::OnRepairTimeout() node>2528396014 sender>1343672144 end NACK hold-off ...

bebopagogo commented 2 years ago

You can run the trace option or a higher debug level for more details at your sender.


agrossmann09 commented 2 years ago

I have both my sender and receiver up to debug level 12. I also encounter an issue where one node can send to the other, but the other node can only receive messages. When I try to send a message, the debug output is the same:

Screen Shot 2021-11-19 at 9 13 26 AM

where the queue is flushed, but eventually the node that doesn't work gets an OnActivityTimeout message. And in this case the receiver that is working correctly doesn't have any debug output showing that a message, NACK, etc. is being generated. This is not the case all the time; I don't know what triggers this seemingly random issue.

agrossmann09 commented 2 years ago

I was able to figure out that I needed to increase the TX rate on the sender in order to send a file successfully. However, I am still having this intermittent issue where, in a system of 3 nodes:

- Node 1 can send a message to Nodes 2 and 3
- Node 2 can send a message to Nodes 1 and 3
- Node 3 can only send a message to Node 1

The nodes are:

- Node 1: MacBook
- Node 2: Android
- Node 3: Raspberry Pi

agrossmann09 commented 2 years ago

After running a couple more tests, I think there is still an issue with the file send speed. I figured out how to get the 10mb file sent, but it takes anywhere from 5-15 minutes. I noticed that the file labeled normTemp* is sent over immediately. The larger problem is that the receiver continues to get OBJECT_UPDATED events. Eventually the process does complete, but I am not sure what is going on behind the scenes that is causing the large delay.

bebopagogo commented 2 years ago

Is it possible for you to capture a log file from the sender and receiver with the “trace” option enabled? Also, if you are able to get pcap files from both the sender and receiver, I can make use of those, too. I have some extra (undocumented) tools I use to analyze the trace files.

In the “normCast” or “norm” demo app, the “normTemp” name is temporarily assigned to received file objects until the “NORM_INFO” message that conveys the file name arrives (that is how those example/demo apps use the NORM_INFO; other strategies could be used, including not using NORM_INFO at all and instead using an archive tool/library to encode transmitted files with additional file metadata like creation date, etc., if desired).

If your connection has high packet loss and much of an RTT, the “end game” for NORM can be slow to finalize transmission. The NACK process is paced by the round-trip timing, and if NACKs or requested repair messages are lost (due to packet loss), it can take multiple rounds of the NACK process to finalize the transfer. For small multicast group sizes, this can be mitigated by setting the “backoff factor” to zero to reduce the NACK suppression/aggregation timeouts. Those timeouts are part of the feedback suppression mechanism that allows NORM to operate with large group sizes and maximize ARQ efficiency (in terms of loading the network as minimally as possible), but at the expense of delay. By setting the backoff factors to zero you can favor lower delay over efficiency. Another option to further reduce repair delay (but also at the expense of less efficient operation) is to turn off NORM's usual GRTT probing (used to measure group round-trip time) and set a fixed, small GRTT value at the sender. Again, this tunes NORM for rapid repair but less efficiency, due to the potential for extraneous NACKs and redundant repair transmissions when the NACK/repair timeouts aren't well matched to the current network RTT.

At higher debug levels, the NORM debug output can provide more insight on what it is doing and prints out the measured round trip times observed and the details of the receiver NACK “backoff timeout”, the sender NACK “aggregation timeout” and “holdoff timeout” events and intervals that define the ARQ protocol operation.  Coupled with the “trace” output (per packet send/receive event logging), you can directly observe what’s happening behind the scenes.

Cheers,

Brian


agrossmann09 commented 2 years ago

I will work on getting those trace/debug outputs between my Android and Pi. It doesn't occur often, so hopefully I can capture the issue. In the meantime, thank you for explaining the tail end of the NORM file transfer process; that part was unclear to me. I have tried messing with the GRTT and setting the backoff factor to 0, which did help a little bit, so I understand why that is now. Knowing the network that I will be applying NORM in, it will probably be best not to change settings that best fit smaller networks, since I will be dealing with a larger network and need a semi-reliable connection. I can continue to play with the GRTT to find a sweet spot in my network.

A quick question about the trace option. Is that built into your norm tool, or is it a build option in the Java JNI libraries and dylib? I am currently using the Java JNI to build my own command-line debugging tool alongside an Android app.

agrossmann09 commented 2 years ago

Also, could setting the DefaultRepairBoundary to repair once the object is complete, with the NormRepairBoundary.Norm_BOUNDARY_OBJECT option, help with file transfer speeds? I noticed a small improvement, ~30 seconds, but that could be a coincidence based on my network speeds.

agrossmann09 commented 2 years ago

This is my output on the Mac, which is not receiving messages from the Android (the Pi is not receiving messages as well):

Proto Error: NormApp::Notify() FILE/DATA objects not _yet_ supported...
Proto Error: NormSenderNode::HandleObjectMessage() object not accepted
Proto Error: NormApp::Notify() FILE/DATA objects not _yet_ supported...
Proto Error: NormSenderNode::HandleObjectMessage() object not accepted
^[[A^C[alessandrogrossmann@192.168.0.6] ~/norm/bin:(android_build_gradle)$ ./norm address 224.1.2.3/4004 trace on output STDIN
trace>18:01:15.124135 node>184312376 src>192.168.0.5/54212 inst>21457 seq>89 INFO obj>3 len>32 
Proto Error: NormApp::Notify() FILE/DATA objects not _yet_ supported...
Proto Error: NormSenderNode::HandleObjectMessage() object not accepted
trace>18:01:15.126220 node>184312376 src>192.168.0.5/54212 inst>21457 seq>90 DATA obj>3 blk>0 seg>0 len>39 
trace>18:01:15.127626 node>184312376 src>192.168.0.5/54212 inst>21457 seq>91 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:15.634430 node>184312376 src>192.168.0.5/54212 inst>21457 seq>92 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:16.147081 node>184312376 src>192.168.0.5/54212 inst>21457 seq>93 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:16.659045 node>184312376 src>192.168.0.5/54212 inst>21457 seq>94 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:17.171223 node>184312376 src>192.168.0.5/54212 inst>21457 seq>95 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:17.683043 node>184312376 src>192.168.0.5/54212 inst>21457 seq>96 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:18.297502 node>184312376 src>192.168.0.5/54212 inst>21457 seq>97 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:18.808782 node>184312376 src>192.168.0.5/54212 inst>21457 seq>98 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:19.321448 node>184312376 src>192.168.0.5/54212 inst>21457 seq>99 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:19.833049 node>184312376 src>192.168.0.5/54212 inst>21457 seq>100 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:20.345208 node>184312376 src>192.168.0.5/54212 inst>21457 seq>101 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:20.959915 node>184312376 src>192.168.0.5/54212 inst>21457 seq>102 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:21.471574 node>184312376 src>192.168.0.5/54212 inst>21457 seq>103 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:21.983946 node>184312376 src>192.168.0.5/54212 inst>21457 seq>104 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:22.495910 node>184312376 src>192.168.0.5/54212 inst>21457 seq>105 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:23.007642 node>184312376 src>192.168.0.5/54212 inst>21457 seq>106 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:23.624159 node>184312376 src>192.168.0.5/54212 inst>21457 seq>107 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:24.236384 node>184312376 src>192.168.0.5/54212 inst>21457 seq>108 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:24.646047 node>184312376 src>192.168.0.5/54212 inst>21457 seq>109 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:25.157790 node>184312376 src>192.168.0.5/54212 inst>21457 seq>110 CMD(FLUSH)  obj>3 blk>0 seg>0 len>20 
trace>18:01:36.729385 node>184312376 src>192.168.0.5/54212 inst>21457 seq>111 CMD(CC)  seq>10 len>24 
^C[alessandrogrossmann@192.168.0.6] ~/norm/bin:(android_build_gradle)$ ./norm address 224.1.2.3/4004 trace o
Proto Fatal: NormApp::OnCommand(trace) invalid argument!
Proto Fatal: NormApp::ProcessCommands() OnCommand(trace, o) error
Proto Fatal: NormApp::OnStartup() error processing command-line commands
[alessandrogrossmann@192.168.0.6] ~/norm/bin:(android_build_gradle)$ ./norm address 224.1.2.3/4004 
Proto Fatal: NormApp::OnStartup() Error! no "input", "output", "sendfile", "stest", "mtest", or "rxcache" given.
[alessandrogrossmann@192.168.0.6] ~/norm/bin:(android_build_gradle)$ ./norm address 224.1.2.3/4004 trace on rxcache ./
trace>18:02:06.734382 node>184312376 src>192.168.0.5/54212 inst>21457 seq>112 CMD(CC)  seq>11 len>24 
trace>18:02:29.057321 node>184312376 src>192.168.0.5/54212 inst>21457 seq>113 INFO obj>4 len>41 
trace>18:02:29.060265 node>184312376 src>192.168.0.5/54212 inst>21457 seq>114 DATA obj>4 blk>0 seg>0 len>50 
trace>18:02:29.061782 node>184312376 src>192.168.0.5/54212 inst>21457 seq>115 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:29.569343 node>184312376 src>192.168.0.5/54212 inst>21457 seq>116 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:30.183259 node>184312376 src>192.168.0.5/54212 inst>21457 seq>117 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:30.695692 node>184312376 src>192.168.0.5/54212 inst>21457 seq>118 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:31.207688 node>184312376 src>192.168.0.5/54212 inst>21457 seq>119 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:31.720987 node>184312376 src>192.168.0.5/54212 inst>21457 seq>120 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:32.231109 node>184312376 src>192.168.0.5/54212 inst>21457 seq>121 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:32.845626 node>184312376 src>192.168.0.5/54212 inst>21457 seq>122 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:33.357819 node>184312376 src>192.168.0.5/54212 inst>21457 seq>123 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:33.869933 node>184312376 src>192.168.0.5/54212 inst>21457 seq>124 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:34.381706 node>184312376 src>192.168.0.5/54212 inst>21457 seq>125 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:34.893793 node>184312376 src>192.168.0.5/54212 inst>21457 seq>126 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:35.508288 node>184312376 src>192.168.0.5/54212 inst>21457 seq>127 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:36.023902 node>184312376 src>192.168.0.5/54212 inst>21457 seq>128 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:36.532143 node>184312376 src>192.168.0.5/54212 inst>21457 seq>129 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:36.634289 node>184312376 src>192.168.0.5/54212 inst>21457 seq>130 CMD(CC)  seq>12 len>24 
trace>18:02:37.044395 node>184312376 src>192.168.0.5/54212 inst>21457 seq>131 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:37.555955 node>184312376 src>192.168.0.5/54212 inst>21457 seq>132 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:38.171028 node>184312376 src>192.168.0.5/54212 inst>21457 seq>133 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:38.682886 node>184312376 src>192.168.0.5/54212 inst>21457 seq>134 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
trace>18:02:39.195083 node>184312376 src>192.168.0.5/54212 inst>21457 seq>135 CMD(FLUSH)  obj>4 blk>0 seg>0 len>20 
^C[alessandrogrossmann@192.168.0.6] ~/norm/bin:(android_build_gradle)$ ./norm address 224.1.2.3/4004 trace on rxcache ./ debug 12
Proto Info: ProtoDebug>SetDebugLevel: debug level changed from 1 to 12
Proto Warn: ProtoAddress::ResolveFromString() getaddrinfo() error: Undefined error: 0
Proto Warn: ProtoSocket::EnableRecvDstAddr() setsocktopt(IPV6_PKTINFO) error: Invalid argument
Proto Info: REPORT time>18:03:20.207403 node>184312376 ***************************************
Proto Info: ***************************************************************************
trace>18:03:26.813415 node>184312376 src>192.168.0.5/54212 inst>21457 seq>137 INFO obj>5 len>40 
Proto Debug: NormSession::ReceiverHandleObjectMessage() node>184312376 new remote sender:3217917415 ...
Proto Debug: NormApp::Notify() unhandled event: 10
Proto Debug: NormApp::Notify(REMOTE_SENDER_ACTIVE) ...
Proto Debug: NormApp::Notify(RX_OBJECT_NEW) ...
Proto Info: 18:03:26.813640 start rx object>5 sender>3217917415
Proto Detail: NormSenderNode::HandleObjectMessage() node>184312376 sender>3217917415 new obj>5
Proto Debug: NormApp::Notify(RX_OBJECT_INFO) ...
trace>18:03:26.815305 node>184312376 src>192.168.0.5/54212 inst>21457 seq>138 DATA obj>5 blk>0 seg>0 len>50 
Proto Debug: NormSenderNode::HandleObjectMessage() node>184312376 allocating sender>3217917415 buffers ...
Proto Detail: NormObject::HandleObjectMessage() node>184312376 sender>3217917415 obj>5 blk>0 completed block ...
Proto Trace: NormApp::Notify(RX_OBJECT_UPDATED) ...
Proto Debug: NormApp::Notify(RX_OBJECT_COMPLETED) ...
Proto Info: 18:03:26.817820 completed rx object>5 trace>18:03:26.817853 node>184312376 src>192.168.0.5/54212 inst>21457 seq>139 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:27.323951 node>184312376 src>192.168.0.5/54212 inst>21457 seq>140 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:27.938330 node>184312376 src>192.168.0.5/54212 inst>21457 seq>141 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:28.347575 node>184312376 src>192.168.0.5/54212 inst>21457 seq>142 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:28.962743 node>184312376 src>192.168.0.5/54212 inst>21457 seq>143 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:29.473880 node>184312376 src>192.168.0.5/54212 inst>21457 seq>144 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:29.986267 node>184312376 src>192.168.0.5/54212 inst>21457 seq>145 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
Proto Info: REPORT time>18:03:30.210882 node>184312376 ***************************************
Proto Info: Remote sender>3217917415 grtt>0.266300 sec loss>0.000000
Proto Info:    rxRate>    0.184 kbps rx_goodput>    0.014 kbps
Proto Info:    rxObjects> completed>1 pending>0 failed>0
Proto Info:    fecBufferUsage> current>0 peak>1024 overuns>0
Proto Info:    strBufferUsage> current>0 peak>0 overuns>0
Proto Info:    resyncs>0 nacks>0 suppressed>0
Proto Info: ***************************************************************************
trace>18:03:30.497888 node>184312376 src>192.168.0.5/54212 inst>21457 seq>146 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:31.010456 node>184312376 src>192.168.0.5/54212 inst>21457 seq>147 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:31.624640 node>184312376 src>192.168.0.5/54212 inst>21457 seq>148 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:32.136622 node>184312376 src>192.168.0.5/54212 inst>21457 seq>149 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:32.648200 node>184312376 src>192.168.0.5/54212 inst>21457 seq>150 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:33.160622 node>184312376 src>192.168.0.5/54212 inst>21457 seq>151 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:33.672628 node>184312376 src>192.168.0.5/54212 inst>21457 seq>152 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:34.286640 node>184312376 src>192.168.0.5/54212 inst>21457 seq>153 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:34.799242 node>184312376 src>192.168.0.5/54212 inst>21457 seq>154 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:35.414008 node>184312376 src>192.168.0.5/54212 inst>21457 seq>155 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:35.823568 node>184312376 src>192.168.0.5/54212 inst>21457 seq>156 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:36.438039 node>184312376 src>192.168.0.5/54212 inst>21457 seq>157 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
trace>18:03:36.642469 node>184312376 src>192.168.0.5/54212 inst>21457 seq>158 CMD(CC)  seq>14 len>24 
trace>18:03:36.949504 node>184312376 src>192.168.0.5/54212 inst>21457 seq>159 CMD(FLUSH)  obj>5 blk>0 seg>0 len>20 
Proto Info: REPORT time>18:03:40.212711 node>184312376 ***************************************
Proto Info: Remote sender>3217917415 grtt>0.266300 sec loss>0.000000
Proto Info:    rxRate>    0.227 kbps rx_goodput>    0.000 kbps
Proto Info:    rxObjects> completed>1 pending>0 failed>0
Proto Info:    fecBufferUsage> current>0 peak>1024 overuns>0
Proto Info:    strBufferUsage> current>0 peak>0 overuns>0
Proto Info:    resyncs>0 nacks>0 suppressed>0
Proto Info: ***************************************************************************
Proto Info: NormSenderNode::OnActivityTimeout() node>184312376 for sender>3217917415
Proto Info: REPORT time>18:03:50.222018 node>184312376 ***************************************
Proto Info: Remote sender>3217917415 grtt>0.266300 sec loss>0.000000
Proto Info:    rxRate>    0.000 kbps rx_goodput>    0.000 kbps
Proto Info:    rxObjects> completed>1 pending>0 failed>0
Proto Info:    fecBufferUsage> current>0 peak>1024 overuns>0
Proto Info:    strBufferUsage> current>0 peak>0 overuns>0
Proto Info:    resyncs>0 nacks>0 suppressed>0
Proto Info: ***************************************************************************
Proto Info: NormSenderNode::OnActivityTimeout() node>184312376 for sender>3217917415
Proto Info: REPORT time>18:04:00.225884 node>184312376 ***************************************
Proto Info: Remote sender>3217917415 grtt>0.266300 sec loss>0.000000
Proto Info:    rxRate>    0.000 kbps rx_goodput>    0.000 kbps
Proto Info:    rxObjects> completed>1 pending>0 failed>0
Proto Info:    fecBufferUsage> current>0 peak>1024 overuns>0
Proto Info:    strBufferUsage> current>0 peak>0 overuns>0
Proto Info:    resyncs>0 nacks>0 suppressed>0
Proto Info: ***************************************************************************

This is the debug level 12 output from my Android. I don't have the trace option.

2021-11-22 13:03:29.694 9171-9171/com.example.normapp D/ViewRootImpl@d1b8a1b[MainActivity]: ViewPostIme pointer 0
2021-11-22 13:03:29.784 9171-9171/com.example.normapp D/ViewRootImpl@d1b8a1b[MainActivity]: ViewPostIme pointer 1
2021-11-22 13:03:29.800 9171-9171/com.example.normapp D/InputMethodManager: HSIFW - flag : 0
2021-11-22 13:03:29.806 9171-9171/com.example.normapp D/InputMethodManager: prepareNavigationBarInfo() DecorView@7ce76be[MainActivity]
2021-11-22 13:03:29.806 9171-9171/com.example.normapp D/InputMethodManager: getNavigationBarColor() -855310
2021-11-22 13:03:29.806 9171-9171/com.example.normapp V/InputMethodManager: Starting input: tba=com.example.normapp ic=com.android.internal.widget.EditableInputConnection@209d61c mNaviBarColor -855310 mIsGetNaviBarColorSuccess true , NavVisible : true , NavTrans : false
2021-11-22 13:03:29.806 9171-9171/com.example.normapp D/InputMethodManager: startInputInner - Id : 0
2021-11-22 13:03:29.809 9171-9171/com.example.normapp I/InputMethodManager: startInputInner - mService.startInputOrWindowGainedFocus
2021-11-22 13:03:29.811 9171-9171/com.example.normapp D/InputTransport: Input channel destroyed: 'ClientS', fd=88
2021-11-22 13:03:29.813 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:1)...
2021-11-22 13:03:29.813 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:29.823 9171-9171/com.example.normapp W/IInputConnectionWrapper: requestCursorAnchorInfo on inactive InputConnection
2021-11-22 13:03:29.859 9171-9171/com.example.normapp W/IInputConnectionWrapper: getExtractedText on inactive InputConnection
2021-11-22 13:03:29.863 9171-9171/com.example.normapp W/IInputConnectionWrapper: getExtractedText on inactive InputConnection
2021-11-22 13:03:29.866 9171-9171/com.example.normapp W/IInputConnectionWrapper: getExtractedText on inactive InputConnection
2021-11-22 13:03:29.897 9171-9171/com.example.normapp D/ViewRootImpl@d1b8a1b[MainActivity]: MSG_RESIZED: frame=(0,0,1080,2400) ci=(0,78,0,144) vi=(0,78,0,144) or=1
2021-11-22 13:03:30.346 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:30.346 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:2)...
2021-11-22 13:03:30.347 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:30.879 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:30.879 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:3)...
2021-11-22 13:03:30.879 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:31.412 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:31.412 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:4)...
2021-11-22 13:03:31.412 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:31.945 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:31.945 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:5)...
2021-11-22 13:03:31.945 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:32.478 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:32.478 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:6)...
2021-11-22 13:03:32.478 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:33.011 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:33.011 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:7)...
2021-11-22 13:03:33.011 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:33.543 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:33.544 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:8)...
2021-11-22 13:03:33.544 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:34.076 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:34.076 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:9)...
2021-11-22 13:03:34.076 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:34.609 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:34.609 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:10)...
2021-11-22 13:03:34.609 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:34.814 9171-9910/com.example.normapp I/protolib: REPORT time>18:03:34.814780 node>3217917415 ***************************************
2021-11-22 13:03:34.814 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:03:34.814 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.232 grtt>0.266300
2021-11-22 13:03:34.814 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:03:34.852 9171-9908/com.example.normapp I/protolib: REPORT time>18:03:34.852752 node>3217917415 ***************************************
2021-11-22 13:03:34.852 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:03:35.142 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:35.142 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:11)...
2021-11-22 13:03:35.142 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:35.675 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:35.675 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:12)...
2021-11-22 13:03:35.675 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:36.208 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:36.208 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:13)...
2021-11-22 13:03:36.208 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:36.741 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:36.741 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:14)...
2021-11-22 13:03:36.741 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:37.273 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:37.274 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:15)...
2021-11-22 13:03:37.274 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:37.806 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:37.806 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:16)...
2021-11-22 13:03:37.806 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:38.339 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:38.339 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:17)...
2021-11-22 13:03:38.339 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:38.872 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:38.872 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:18)...
2021-11-22 13:03:38.872 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:39.405 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:39.405 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:19)...
2021-11-22 13:03:39.405 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:39.938 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:39.938 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() node>3217917415, flush queued (flush_count:20)...
2021-11-22 13:03:39.938 9171-9910/com.example.normapp D/protolib: NormSession::SenderQueueFlush() starting flush timeout: 0.532600 sec ....
2021-11-22 13:03:39.940 9171-9910/com.example.normapp V/protolib: NormSession::Serve() node>3217917415 sender flush complete ...
2021-11-22 13:03:40.470 9171-9910/com.example.normapp D/protolib: NormSession::OnFlushTimeout() deactivating flush_timer ....
2021-11-22 13:03:44.815 9171-9910/com.example.normapp I/protolib: REPORT time>18:03:44.815044 node>3217917415 ***************************************
2021-11-22 13:03:44.815 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:03:44.815 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.179 grtt>0.266300
2021-11-22 13:03:44.815 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:03:44.853 9171-9908/com.example.normapp I/protolib: REPORT time>18:03:44.853170 node>3217917415 ***************************************
2021-11-22 13:03:44.853 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:03:54.815 9171-9910/com.example.normapp I/protolib: REPORT time>18:03:54.815430 node>3217917415 ***************************************
2021-11-22 13:03:54.815 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:03:54.815 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.000 grtt>0.266300
2021-11-22 13:03:54.815 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:03:54.853 9171-9908/com.example.normapp I/protolib: REPORT time>18:03:54.853544 node>3217917415 ***************************************
2021-11-22 13:03:54.853 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:04.815 9171-9910/com.example.normapp I/protolib: REPORT time>18:04:04.815876 node>3217917415 ***************************************
2021-11-22 13:04:04.815 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:04:04.816 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.000 grtt>0.266300
2021-11-22 13:04:04.816 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:04.853 9171-9908/com.example.normapp I/protolib: REPORT time>18:04:04.853845 node>3217917415 ***************************************
2021-11-22 13:04:04.853 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:14.816 9171-9910/com.example.normapp I/protolib: REPORT time>18:04:14.816219 node>3217917415 ***************************************
2021-11-22 13:04:14.816 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:04:14.816 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.019 grtt>0.266300
2021-11-22 13:04:14.816 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:14.854 9171-9908/com.example.normapp I/protolib: REPORT time>18:04:14.854191 node>3217917415 ***************************************
2021-11-22 13:04:14.854 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:24.816 9171-9910/com.example.normapp I/protolib: REPORT time>18:04:24.816700 node>3217917415 ***************************************
2021-11-22 13:04:24.816 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:04:24.816 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.000 grtt>0.266300
2021-11-22 13:04:24.816 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:24.858 9171-9908/com.example.normapp I/protolib: REPORT time>18:04:24.858109 node>3217917415 ***************************************
2021-11-22 13:04:24.858 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:34.817 9171-9910/com.example.normapp I/protolib: REPORT time>18:04:34.817121 node>3217917415 ***************************************
2021-11-22 13:04:34.817 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:04:34.817 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.000 grtt>0.266300
2021-11-22 13:04:34.817 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:34.858 9171-9908/com.example.normapp I/protolib: REPORT time>18:04:34.858359 node>3217917415 ***************************************
2021-11-22 13:04:34.858 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:44.817 9171-9910/com.example.normapp I/protolib: REPORT time>18:04:44.817513 node>3217917415 ***************************************
2021-11-22 13:04:44.817 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:04:44.817 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.019 grtt>0.266300
2021-11-22 13:04:44.817 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:44.858 9171-9908/com.example.normapp I/protolib: REPORT time>18:04:44.858632 node>3217917415 ***************************************
2021-11-22 13:04:44.858 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:54.817 9171-9910/com.example.normapp I/protolib: REPORT time>18:04:54.817945 node>3217917415 ***************************************
2021-11-22 13:04:54.818 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:04:54.818 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.000 grtt>0.266300
2021-11-22 13:04:54.818 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:04:54.858 9171-9908/com.example.normapp I/protolib: REPORT time>18:04:54.858944 node>3217917415 ***************************************
2021-11-22 13:04:54.859 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:05:04.818 9171-9910/com.example.normapp I/protolib: REPORT time>18:05:04.818406 node>3217917415 ***************************************
2021-11-22 13:05:04.818 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:05:04.818 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.000 grtt>0.266300
2021-11-22 13:05:04.818 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:05:04.859 9171-9908/com.example.normapp I/protolib: REPORT time>18:05:04.859249 node>3217917415 ***************************************
2021-11-22 13:05:04.859 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:05:14.818 9171-9910/com.example.normapp I/protolib: REPORT time>18:05:14.818935 node>3217917415 ***************************************
2021-11-22 13:05:14.819 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:05:14.819 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.019 grtt>0.266300
2021-11-22 13:05:14.819 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:05:14.864 9171-9908/com.example.normapp I/protolib: REPORT time>18:05:14.864727 node>3217917415 ***************************************
2021-11-22 13:05:14.864 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:05:24.820 9171-9910/com.example.normapp I/protolib: REPORT time>18:05:24.820475 node>3217917415 ***************************************
2021-11-22 13:05:24.820 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:05:24.820 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.000 grtt>0.266300
2021-11-22 13:05:24.820 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:05:24.866 9171-9908/com.example.normapp I/protolib: REPORT time>18:05:24.866221 node>3217917415 ***************************************
2021-11-22 13:05:24.866 9171-9908/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:05:34.821 9171-9910/com.example.normapp I/protolib: REPORT time>18:05:34.820958 node>3217917415 ***************************************
2021-11-22 13:05:34.821 9171-9910/com.example.normapp I/protolib: Local status:
2021-11-22 13:05:34.821 9171-9910/com.example.normapp I/protolib:    txRate>500000.000 kbps sentRate>    0.000 grtt>0.266300
2021-11-22 13:05:34.821 9171-9910/com.example.normapp I/protolib: ***************************************************************************
2021-11-22 13:05:34.866 9171-9908/com.example.normapp I/protolib: REPORT time>18:05:34.866484 node>3217917415 ***************************************
2021-11-22 13:05:34.866 9171-9908/com.example.normapp I/protolib: ***************************************************************************
agrossmann09 commented 2 years ago

android_mac_norm_pcap.zip

Here is a zip file of the pcap I captured on my Mac of both sending and receiving messages.

agrossmann09 commented 2 years ago

I have also read some forums saying that Android IPv4 multicast can be buggy unless you use a MulticastSocket or obtain a Multicast Lock before performing certain operations. I tried some IPv6 addresses, but norm is failing with:

Proto Info: ProtoDebug>SetDebugLevel: debug level changed from 1 to 12
Proto Warn: ProtoSocket::EnableRecvDstAddr() setsocktopt(IP_RECVDSTADDR) error: Invalid argument
Proto Error: ProtoSocket::SetFragmentation() setsockopt(IP_DONTFRAG) error: Invalid argument
Proto Warn: NormSession::Open() warning: tx_socket.SetFragmentation() error
Proto Error: ProtoSocket:JoinGroup() setsockopt(add membership) error: Can't assign requested address
Proto Fatal: NormSession::Open() rx_socket.JoinGroup error
Proto Error: ProtoSocket::LeaveGroup() error leaving multicast group: Can't assign requested address
Proto Warn: ProtoSocket::EnableRecvDstAddr() setsocktopt(IP_RECVDSTADDR) error: Invalid argument
Proto Error: ProtoSocket::SetFragmentation() setsockopt(IP_DONTFRAG) error: Invalid argument
Proto Warn: NormSession::Open() warning: tx_socket.SetFragmentation() error
Proto Error: ProtoSocket:JoinGroup() setsockopt(add membership) error: Can't assign requested address
Proto Fatal: NormSession::Open() rx_socket.JoinGroup error
Proto Error: ProtoSocket::LeaveGroup() error leaving multicast group: Can't assign requested address
java.io.IOException: Failed to start receiver
agrossmann09 commented 2 years ago

Going back to the transfer speeds: even if I set the txRate to my network bandwidth max, the file transfer rate is 45 kbps. Correct me if I am wrong, but it seems the txRate only affects the transfer of the bytes for the normTemp cached file; the time it takes to reach the object-complete event still stalls at 45 kbps.

bebopagogo commented 2 years ago

What size files are you sending, and what is your network bandwidth max (and the RTT and packet loss rate, if any)? I still need to analyze the trace files you sent me, but I could also recreate your test case.

I think I mentioned previously that if you use a fixed transmit rate, you need to back off a little on the transmit rate to allow for lower-layer protocol overhead (including IP, and Ethernet or whatever link-layer framing protocol you are using). If you are using a NORM segment size that is relatively large (e.g., over 1000 bytes), configuring your NORM transmit rate to 90-94% of your link rate is probably safe. Otherwise, if you overbook your link rate, you cause extraneous packet loss and NACK/repair transmission cycles.


agrossmann09 commented 2 years ago

I am trying to send a 10 MB file. My current network bandwidth is about 150 Mbps upload and 15 Mbps download. My TxRate varies from 250 Mbps to 1 Gbps. Running traceroute to find my RTT, I get the following output: 29.479 ms 29.776 ms 37.503 ms. I am currently using session.setTxRate(384000) (this is my test environment), with a segment size of 1024.

agrossmann09 commented 2 years ago

I tested sending the file as a byte buffer using enqueueData, and that seemed to improve the speed to about 60 seconds for a 10 MB file.

agrossmann09 commented 2 years ago

Another issue when sending a large file (~100 MB) is that I get a "message_pool is empty" warning/fatal. This then prevents me from sending another large object. Is this an issue to do with the buffer space?

bebopagogo commented 2 years ago

When you set the tx rate to 384000, that is 384 kbps. For a 10 MB file, that should take about 8 * 10 * 1024 * 1024 bits / 384000 bits/sec, or roughly 219 seconds. With your much higher speed connections, that should not be a problem. I haven't had a chance to look at the trace file you posted above, and I will try to do that (I took some time off during the holiday period).

Depending on how you use the NORM API, you can enforce different types of flow control. By default, NORM uses timer-based flow control based on the measured round-trip timing: if the sender does not see any NACKs from the receiver(s) for an older object that was previously enqueued, it will accept a newly enqueued object and purge that older object (the NORM buffering cache API lets you limit the amount of memory used for maintaining state on enqueued objects). That only happens when the tx cache is filled, so if you are sending a single file, that should not be an issue either.

If you are able to enable the debug output like you did above, that may provide some insight. The "trace" option is useful to validate that the sender is in fact receiving NACK/ACK messages from the receivers and that the receivers are hearing the sender, i.e., to verify there are no routing or firewall issues interfering with NORM protocol operation.

bebopagogo commented 2 years ago

FYI - from the debug output you posted above, it looks like your Android is not hearing the other senders. Its "REPORT" output would have a stanza for each "Remote sender" heard, like the Mac debug output shows. The PCAP had NORM protocol messages (NORM_CMD(CC), NORM_CMD(FLUSH), etc.) but only a single NORM_DATA packet per sender. That would be the case if each node sent an object that was small enough to fit within a single NORM segment (a single packet payload). I have a "pcap2norm" utility that outputs something like the NORM debug "trace" output from the pcap data, and here are the DATA messages seen in the PCAP zip you posted above:

trace>18:27:32.952916 src>192.168.0.5/37714 dst>224.1.2.3/4000 id>0x9d2ca54e inst>1794 seq>8 DATA obj>0 blk>0 seg>0000 len>49
trace>18:27:40.448449 src>192.168.0.6/63799 dst>224.1.2.3/4004 id>0xa8a9cc5a inst>61999 seq>25 DATA obj>0 blk>0 seg>0000 len>47