km4arr / openpgm

Automatically exported from code.google.com/p/openpgm

Setting configuration parameters for Sender and Receiver side on OpenPGM #10

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I am currently running performance tests with OpenPGM and have hit a problem
that I think comes from setting the parameters incorrectly. In the current case
I split 2 megabytes of data into 1500-byte packets and send them. On the
receiver side the receiver stops and quits with IO_STATUS_RESET after receiving
approximately 300K, which looks like a buffering problem.

I don't know whether I am missing a parameter that controls buffer space, or
something else entirely. I would very much appreciate a sample set of
parameters for the receiver side and the sender side that would overcome this
problem. The parameters I am currently setting are the following:

<pgm_multicast_receiver>
  <multicast_address>eth0;239.192.0.1</multicast_address>
  <port>7500</port>
  <udp_encap_port>false</udp_encap_port>
  <max_tpdu>1500</max_tpdu>
  <sqns>100000</sqns>
  <multicast_loop>1</multicast_loop>
  <multicast_hops>16</multicast_hops>
  <no_router_assist>0</no_router_assist>
  <recv_only>1</recv_only>
  <passive>1</passive>
  <peer_expiry_secs>300</peer_expiry_secs>
  <spmr_expiry_msecs>250</spmr_expiry_msecs>
  <nak_bo_ivl_msecs>50</nak_bo_ivl_msecs>
  <nak_rpt_ivl_msecs>2000</nak_rpt_ivl_msecs>
  <nak_rdata_ivl_msecs>2000</nak_rdata_ivl_msecs>
  <nak_data_retries>50</nak_data_retries>
  <nak_ncf_retries>50</nak_ncf_retries>
</pgm_multicast_receiver>

<pgm_multicast_sender>
  <multicast_address>eth0;239.192.0.1</multicast_address>
  <port>7500</port>
  <max_tpdu>1500</max_tpdu>
  <sqns>100</sqns>
  <odata_max_rte>500000</odata_max_rte>
  <txw_max_rte>400000</txw_max_rte>
  <enablefec>0</enablefec>
  <fecinfo_block_size>255</fecinfo_block_size>
  <fecinfo_proactive_packets>0</fecinfo_proactive_packets>
  <fecinfo_group_size>8</fecinfo_group_size>
  <fecinfo_ondemand_parity_enabled>1</fecinfo_ondemand_parity_enabled>
  <fecinfo_var_pktlen_enabled>1</fecinfo_var_pktlen_enabled>
  <multicast_loop>1</multicast_loop>
  <multicast_hops>16</multicast_hops>
</pgm_multicast_sender>
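
(For reference, these fields presumably map onto OpenPGM socket options set
through pgm_setsockopt(); a rough receiver-side sketch, assuming an
already-created pgm_sock_t* named sock and the pgm_secs()/pgm_msecs() helpers
for the interval values:)

#include <pgm/pgm.h>

/* sketch only: window size and NAK timing taken from the XML above */
const int rxw_sqns    = 100000;
const int peer_expiry = pgm_secs (300);
const int spmr_expiry = pgm_msecs (250);
const int nak_bo_ivl  = pgm_msecs (50);
pgm_setsockopt (sock, IPPROTO_PGM, PGM_RXW_SQNS,    &rxw_sqns,    sizeof (rxw_sqns));
pgm_setsockopt (sock, IPPROTO_PGM, PGM_PEER_EXPIRY, &peer_expiry, sizeof (peer_expiry));
pgm_setsockopt (sock, IPPROTO_PGM, PGM_SPMR_EXPIRY, &spmr_expiry, sizeof (spmr_expiry));
pgm_setsockopt (sock, IPPROTO_PGM, PGM_NAK_BO_IVL,  &nak_bo_ivl,  sizeof (nak_bo_ivl));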

Original issue reported on code.google.com by yavuze...@gmail.com on 10 Oct 2011 at 1:14

GoogleCodeExporter commented 9 years ago
Is there a rate limit?

Generally, on non-perfect networks you will drop packets and start the
reliability process. If you are publishing at full speed there is little or no
capacity left for recovery to occur. It is recommended to set the rate limit,
at least for ODATA, to a percentage below full capacity. Testing and network
analysis can derive the ideal parameters; you may wish to start at around 60%
of channel capacity.

With OpenPGM 5 you have a choice of different rate limiters:

* PGM_ODATA_MAX_RTE - Limit original data packets.
* PGM_RDATA_MAX_RTE - Limit repair data packets.
* PGM_TXW_MAX_RTE   - Limit total transmit data capacity.

The following parameters may also be helpful:

* PGM_UNCONTROLLED_ODATA - No limit on original data.
* PGM_UNCONTROLLED_RDATA - No limit on repair data.
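
A minimal sketch of applying a limit with pgm_setsockopt() (assuming an
already-created pgm_sock_t* named sock; values are in bits per second):

#include <pgm/pgm.h>

/* sketch: cap original data at 60 Mbit/s and repair data at 10 Mbit/s */
const int odata_max_rte = 60 * 1000 * 1000;
const int rdata_max_rte = 10 * 1000 * 1000;
pgm_setsockopt (sock, IPPROTO_PGM, PGM_ODATA_MAX_RTE, &odata_max_rte, sizeof (odata_max_rte));
pgm_setsockopt (sock, IPPROTO_PGM, PGM_RDATA_MAX_RTE, &rdata_max_rte, sizeof (rdata_max_rte));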

The next point to consider is whether the sender remains active after
publishing its stream of packets. For recovery to work, the sender needs to
stay alive and process incoming repair requests until enough time has passed to
reasonably assure that all recipients have received contiguous, correct data.
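
For example, a rough sketch of that linger step (assuming a simple time-based
wait; a real application would keep servicing the socket during this window, or
wait until no NAKs have been seen for a few NAK_RPT_IVL periods):

#include <unistd.h>
#include <pgm/pgm.h>

/* sketch: after the last pgm_send(), keep the sender alive so incoming NAKs
 * can still be answered with RDATA, then close with flushing */
const unsigned linger_secs = 30;   /* assumption: tune to your NAK/repair intervals */
sleep (linger_secs);               /* in practice, keep running your poll/recv loop here */
pgm_close (sock, TRUE);            /* TRUE = flush outstanding data before closing */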

Original comment by fnjo...@gmail.com on 10 Oct 2011 at 4:06

GoogleCodeExporter commented 9 years ago
* PGM_ODATA_MAX_RTE - Limit original data packets.
* PGM_RDATA_MAX_RTE - Limit repair data packets.
* PGM_TXW_MAX_RTE   - Limit total transmit data capacity.

How can we calculate the transfer rate (in megabytes/second) based on these 
parameters?

What is the effect of the sqns parameter on the receiver side?

Original comment by yavuze...@gmail.com on 12 Oct 2011 at 8:45

GoogleCodeExporter commented 9 years ago
The rates are in bits per second at the IP layer, i.e. excluding Ethernet 
preamble & inter-frame gaps.  So 16*1000*1000 would be 16mbit/s.
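
For example, to target roughly 10 megabytes per second of payload you would set
the limit to about 10 * 8 * 1000 * 1000 = 80,000,000 (80 Mbit/s), plus a little
headroom for the PGM and IP headers that the limiter also counts.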

Honestly, just start with ODATA_MAX_RTE at 60*1000*1000 and tweak if required.
For example, if you witness too many repair data packets on the wire you may
need to add a separate RDATA limit.

Defining the size of the transmit and receive windows is quite complicated but
of little consequence. If the windows are too large you waste a small amount of
memory and add minor overhead to high-speed delivery due to cache churn. If the
windows are too small the sockets reset too frequently because the original
data is no longer available for repair.

Consider that TIBCO Rendezvous specifies the window size in seconds, mainly for
historical reasons: on 10 Mbit networks you cannot send that much data in a
second. For high-throughput gigabit-and-beyond networks a full second may be
too large and impractical because of the recovery latencies such a
configuration would allow.
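
As a rough worked example: at 60 Mbit/s with 1,500-byte TPDUs, one second of
transmission is about 60,000,000 / 8 / 1,500 = 5,000 packets, so a window of
5,000 sequence numbers holds roughly one second of repair history; scale the
sqns values from there according to how much recovery latency you can tolerate.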

Original comment by fnjo...@gmail.com on 12 Oct 2011 at 8:56

GoogleCodeExporter commented 9 years ago
Hi there,
I have a very similar situation to this one.
Even after setting up a Gigabit private network between two computers connected
to the same switch, I still see packet loss in the Wireshark capture.
I can also see many NAK messages sent from the receiver and arriving at the
sender, but they never seem to be processed.

From the Performance section of the project home page, I have seen the
following statement:
"Testing has shown on gigabit Ethernet direct connection speeds of 675mb/s can 
be obtained with 700mb/s rate limit and 1,500 byte frames. The 
packet-per-second rate can be increased by reducing the frame size, performance 
will be limited, the actual numbers should be verified with a performance test 
tool such as iperf. " 
Those are very attractive numbers. Could you please share the
sending/receiving parameters you used as a reference?
Here are mine:
const int PGM_BUFFER_SIZE = 20 * 1024;
const std::string PGM_MULTICAST_ADDRESS = ";224.0.12.136";
const bool USE_UDP_ENCAP_PORT = false;
const int MAX_RTE = 0;
const int RS_K = 0;
const int RS_N = 0;
const int MAX_TPDU = 1500;
const int SQNS = 100;
const int USE_MULTICAST_LOOP = 0;
const int MULTICAST_HOPS = 16;
const int NO_ROUTER_ASSIST = 0;
const int MAX_ODATA_RTE = 1*1000*1000; // bits per second, i.e. 1 Mbit/s
const int DSCP = 0x2e << 2;

// sender only
const int SEND_ONLY = 1;
const int SENDER_NON_BLOCKING = 0;
const int AMBIENT_SPM = pgm_secs(30);

// receiver only 
const int RECEIVE_ONLY = 1;
const int PASSIVE = 0;
const int PEER_EXPIRY_SECS = pgm_secs(300);
const int SPMR_EXPIRY_MSECS = pgm_msecs(250);
const int NAK_BO_IVL_MSECS = pgm_msecs(50);
const int NAK_RPT_IVL_SECS = pgm_secs(2);
const int NAK_RDATA_IVL_SECS = pgm_secs(2);
const int NAK_DATA_RETRIES = 50;
const int NAK_NCF_RETRIES = 50;
const int RECEIVER_NON_BLOCKING = 1;

Thanks in advance. 

Original comment by who9...@gmail.com on 16 Jul 2012 at 8:17

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
By the way, despite the many NAK messages, I did not see any RDATA packets in
the Wireshark log, so I guess something is missing in my parameter settings. I
am new to OpenPGM, multicast, and network programming in general. Could you
please help me out?
I have attached my Wireshark logs; the data is sent between the "start" and
"end" messages.

Original comment by who9...@gmail.com on 16 Jul 2012 at 8:55

Attachments: