shunwang / openpgm

Automatically exported from code.google.com/p/openpgm

Bad performance of sending RDATA on PGM_NOBLOCKING mode #17


GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Start 10 processes on machine A that subscribe to messages published from machine B.
2. Machine B publishes messages quickly, for example 10,000 messages of 2,000 bytes each, at a high delivery rate (~700 Mb/s).
3. Unrecoverable data loss occurs every time the test is run.

What is the expected output? What do you see instead?
I expect the data loss to eventually be repaired by sender retransmission. Instead, the behaviour seen in the trace log is weird: the retry count goes up to 50 (hard-coded by 0MQ) and the retransmission is cancelled.

What version of the product are you using? On what operating system?
openPGM packaged in zeromq-2.1.11, linux 2.6.18-194.8.1.el5

Please provide any additional information below.
Referring to the sender's log (log level: DEBUG), pgm_on_deferred_nak() was only called once when a new message (NAK) arrived in pgm_recvmsgv(). Why not flush all pending RDATA, as wait_for_event() does in blocking mode? If there are a lot of NAKs, the sender will have little chance to do the repair work.
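
For illustration, a minimal sketch of the kind of flushing loop meant here, done at the application level rather than inside the library. It assumes the OpenPGM 5.x API bundled with zeromq-2.1.11 (pgm_recvmsgv() and the PGM_IO_STATUS_* return codes); the loop bound, buffer size and zero flags value are arbitrary choices for the sketch, not anything the library prescribes.

```c
#include <pgm/pgm.h>

/* Hypothetical helper, not part of the 0MQ integration: give the sender
 * extra chances to service incoming NAKs and queued RDATA by pumping
 * pgm_recvmsgv() several times per readiness event instead of once.
 * "sock" is assumed to be an already-connected, non-blocking PGM send
 * socket. */
static void pump_sender (pgm_sock_t* sock)
{
    struct pgm_msgv_t msgv[1];
    size_t bytes_read = 0;
    pgm_error_t* err = NULL;
    int i;

    for (i = 0; i < 8; i++) {
        const int status = pgm_recvmsgv (sock, msgv, 1, 0,
                                         &bytes_read, &err);
        if (NULL != err) {
            pgm_error_free (err);
            err = NULL;
        }
        /* Stop once the socket reports there is nothing it can do now. */
        if (PGM_IO_STATUS_WOULD_BLOCK == status ||
            PGM_IO_STATUS_ERROR == status ||
            PGM_IO_STATUS_RESET == status)
            break;
    }
}
```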

Original issue reported on code.google.com by stepinto...@gmail.com on 26 Apr 2012 at 3:20

GoogleCodeExporter commented 9 years ago
It's the sender's log file.

Original comment by stepinto...@gmail.com on 26 Apr 2012 at 3:22

Attachments:

GoogleCodeExporter commented 9 years ago
It is by design that repairs are deferred.  There are two reasons: one is that for some configurations the original data must continue flowing in the presence of significant failures; the second is that with jumbograms repairs become very expensive and cause a notable impact on original data delivery.

At high message rates everything tends to fail; the conclusion is that the application has to provide some form of coarse-grained throttling to give the environment the resources for repair and retransmission.
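
A minimal sketch of such coarse-grained throttling at the application level, for illustration only.  send_one() is a hypothetical stand-in for whatever actually publishes a message (e.g. zmq_send on the PUB socket), and the byte budget is an assumption to be tuned per environment.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Application-defined publisher; only a placeholder for this sketch. */
extern void send_one (unsigned seq, size_t len);

/* Hypothetical coarse-grained throttle: cap the application's publish
 * rate so the network and receivers keep headroom for NAKs and RDATA. */
static void publish_throttled (size_t msg_size, unsigned msg_count,
                               uint64_t max_bytes_per_sec)
{
    uint64_t sent = 0;
    unsigned i;
    struct timespec start, now;
    clock_gettime (CLOCK_MONOTONIC, &start);

    for (i = 0; i < msg_count; i++) {
        send_one (i, msg_size);
        sent += msg_size;

        clock_gettime (CLOCK_MONOTONIC, &now);
        double elapsed = (now.tv_sec - start.tv_sec)
                       + (now.tv_nsec - start.tv_nsec) / 1e9;
        double budget  = (double)sent / (double)max_bytes_per_sec;
        /* If we are ahead of the byte budget, sleep off the difference. */
        if (budget > elapsed)
            usleep ((useconds_t)((budget - elapsed) * 1e6));
    }
}
```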

Note that the current configuration of PGM with 0MQ provides low-latency delivery and TCP fairness; it is not hard-tuned for maximum throughput.
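
As a sketch of where that tuning lives, assuming the 0MQ 2.x C API: ZMQ_RATE (kbits/s) and ZMQ_RECOVERY_IVL (seconds), both int64_t values in 0MQ 2.x.  The figures below are arbitrary examples, not recommendations.

```c
#include <assert.h>
#include <stdint.h>
#include <zmq.h>

/* Illustrative only: raise the PGM data rate and recovery window on a
 * 0MQ 2.x PUB socket before connecting it to an epgm:// endpoint. */
static void tune_pgm_pub (void* pub_socket)
{
    int64_t rate = 100000;       /* ~100 Mb/s, expressed in kbits/s */
    int64_t recovery_ivl = 10;   /* 10 second recovery window */

    int rc = zmq_setsockopt (pub_socket, ZMQ_RATE, &rate, sizeof rate);
    assert (rc == 0);
    rc = zmq_setsockopt (pub_socket, ZMQ_RECOVERY_IVL,
                         &recovery_ivl, sizeof recovery_ivl);
    assert (rc == 0);
}
```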

I'm working with another multicast transport on Windows that yields significantly higher throughput by pushing very large multi-fragment packets at a very low packet rate.  So PGM manages ~14,000 packets per second at say ~80 Mb/s, but protocol X manages ~700 Mb/s with only ~1,000 packets per second.

TODO:  inspect log.

-- 
Steve-o

Original comment by fnjo...@gmail.com on 26 Apr 2012 at 2:02

GoogleCodeExporter commented 9 years ago

Hi Steve, thanks a lot for your reply!

Further, regarding "provide some form of coarse-grained throttling to give the environment the resources for repair and retransmission": in the situation where lots of receivers run on the same machine, do you think it is a good idea to use a daemon like tibrv-rvd, which receives packets via OpenPGM and feeds subscribers over IPC?  I think multiple receivers incur heavy IO/CPU overhead, which causes data loss; in my tests, unrecoverable data loss never happened in single-receiver mode.

Another question: per RFC 3208, receivers may optionally multicast a NAK with a TTL of 1 to the local group for missing data packets.  If the sender is busy with ODATA/RDATA/NCF, perhaps receiver-multicast NAKs could share some of the NAK-suppression burden with the sender.

The attachment is the log of receiver 1 (there are 10 receivers).

Original comment by stepinto...@gmail.com on 27 Apr 2012 at 11:51

Attachments:

GoogleCodeExporter commented 9 years ago
The TIBCO Rvd or ciServer, etc., approach is great for propagation of client disconnects, and it can also be efficient for packet fan-out with an appropriate high-speed user-space IPC method.  The trade-off is the cost of one application running socket calls and absorbing the kernel-switch overhead, compared with every application being hit by kernel switching.  For many scenarios it is quite surprising how minimal the difference is.
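
A minimal sketch of such a fan-out daemon using the 0MQ 2.1 C API (zmq_device with ZMQ_FORWARDER): one process joins the multicast feed over PGM and republishes it locally over IPC, so only a single subscriber per host touches the PGM/kernel path.  The epgm:// and ipc:// endpoints are made-up examples, not anything from the original report.

```c
#include <assert.h>
#include <zmq.h>

int main (void)
{
    void* ctx      = zmq_init (1);
    void* frontend = zmq_socket (ctx, ZMQ_SUB);   /* from the network */
    void* backend  = zmq_socket (ctx, ZMQ_PUB);   /* to local clients */

    int rc = zmq_connect (frontend, "epgm://eth0;239.192.1.1:5555");
    assert (rc == 0);
    rc = zmq_setsockopt (frontend, ZMQ_SUBSCRIBE, "", 0);
    assert (rc == 0);
    rc = zmq_bind (backend, "ipc:///tmp/feed.ipc");
    assert (rc == 0);

    /* Blocks forever, shovelling messages frontend -> backend. */
    zmq_device (ZMQ_FORWARDER, frontend, backend);

    zmq_close (frontend);
    zmq_close (backend);
    zmq_term (ctx);
    return 0;
}
```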

Note a limitation of the PGM protocol: whilst you can have multiple receivers on one host, you cannot have multiple senders working out-of-the-box in the same way.  This is where a broker application becomes necessary to manage the incoming NAK requests.

Note that the multicast NAKs are for routed environments, to accelerate NAK suppression.  Within a LAN segment the multicast of NCFs by the original sender performs the same role.  This is another reason for the deferred RDATA transmission: when a NAK is received, an NCF is immediately returned and the RDATA is queued; this enables the network to perform NAK suppression and the receiver to perform NAK elimination, improving the value of the subsequent RDATA broadcast.
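
Schematically, the flow described here looks like the sketch below; the function names are placeholders that mirror the prose, not OpenPGM internals.

```c
/* Placeholder hooks standing in for the sender's real machinery. */
extern void send_ncf (unsigned sequence);      /* multicast an NCF now   */
extern void enqueue_rdata (unsigned sequence); /* queue repair for later */
extern void flush_rdata_queue (void);          /* transmit queued RDATA  */

/* A NAK arrives: confirm immediately, defer the repair.  The NCF lets
 * the network suppress, and other receivers eliminate, duplicate NAKs. */
void on_nak_received (unsigned sequence)
{
    send_ncf (sequence);
    enqueue_rdata (sequence);
}

/* Later, one RDATA broadcast serves every receiver whose NAK was
 * suppressed or eliminated in the meantime. */
void on_repair_timer (void)
{
    flush_rdata_queue ();
}
```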

-- 
Steve-o

Original comment by fnjo...@gmail.com on 27 Apr 2012 at 1:52