Open LiPengze97 opened 3 years ago
Taking a deeper look at the source code: in NonPersistentReplicator.java, if the Netty connection is not writable, the non-persistent messages are simply dropped silently. I assume the connection becomes non-writable when the load on it is too heavy, is that right? Is dropping non-persistent messages the common design for non-persistent topics in a pub-sub system when the load on the connection is heavy?
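For context, the behavior being described looks roughly like the pattern below. This is a simplified, hypothetical sketch of drop-on-backpressure, not the actual NonPersistentReplicator code; the class and counter names are made up for illustration.

```java
// Hypothetical sketch of the drop-on-backpressure pattern described above;
// NOT the actual NonPersistentReplicator implementation.
import io.netty.buffer.Unpooled;
import io.netty.channel.Channel;

class NonPersistentSendSketch {
    private long droppedCount = 0;

    void send(Channel channel, byte[] payload) {
        if (channel.isWritable()) {
            // The outbound buffer still has room: hand the bytes to Netty.
            channel.writeAndFlush(Unpooled.wrappedBuffer(payload));
        } else {
            // The connection is backlogged (congested link or slow peer):
            // non-persistent semantics allow dropping the message silently.
            droppedCount++;
        }
    }
}
```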
Yes, a non-persistent topic doesn't persist messages, so it doesn't give any message delivery guarantee. It's used for use cases that only care about the latest data and don't care about past data, for example live video streaming.
We use a lot of high-throughput non-persistent topics in our system for streaming telemetry; they're useful for use cases where the overhead of tracking message acknowledgement and redelivery just isn't worth it.
- If you can spare the CPU cycles, compressing the data will dramatically reduce the bits on the wire (obviously that's the point), which may be useful when the latency is high (see the sketch after this list).
- You mention using Ubuntu... Most Linux distributions have absolutely terrible default TCP settings, and you have to change them if you want reasonable high-throughput performance. There are dozens of kernel settings, but the ones I always end up changing are the rx/tx queue lengths (txqueuelen), net.core.rmem_max & net.core.wmem_max, net.ipv4.tcp_rmem & net.ipv4.tcp_wmem, etc.
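On the compression point, a minimal sketch of enabling it on the Java producer side is below. The service URL is a placeholder and the choice of LZ4 is just an illustrative assumption; the topic name is the one from this issue.

```java
// Minimal sketch: enabling LZ4 compression on a Pulsar Java producer.
// The service URL is a placeholder, not taken from the issue.
import org.apache.pulsar.client.api.CompressionType;
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;

public class CompressedProducerSketch {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        Producer<byte[]> producer = client.newProducer()
                .topic("non-persistent://my-tenant/my-namespace/wa")
                // Compressed by the producer; consumers decompress transparently.
                .compressionType(CompressionType.LZ4)
                .create();

        producer.send("hello".getBytes());
        client.close();
    }
}
```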
Thanks for your useful suggestions, I will try them!
Getting this confirmed by a Pulsar author is really useful, thanks a lot!
The issue had no activity for 30 days; marking it with the Stale label.
Describe the bug I set up five single-broker Pulsar clusters in five regions: Clemson, Wisconsin, Massachusetts, and two in Utah (Utah1, Utah2). The publisher is in Utah1, and there is a subscriber in each of the remaining clusters. For my experiments I prefer non-persistent topics. I want to use Pulsar's geo-replication, and I found that when the sending throughput is high (about 1k messages/sec, message size 10 KB, 10k messages in total), only the subscriber in Utah2, which has the highest bandwidth from Utah1, receives all the published messages (and not always; sometimes it also receives only a part). The subscribers in the other clusters only receive part of the messages. I used the Java client at first, saw the loss, and thought it was my code's fault. So I used the perf tool, and the problem was still there (see the picture below: the records received are all below 10000, but the producer says
08:25:20.996 [Thread-1] INFO org.apache.pulsar.testclient.PerformanceProducer - Aggregated throughput stats --- 10000 records sent --- 639.113 msg/s --- 49.931 Mbit/s
). I set the parameters below in broker.conf to 10000 (higher than or equal to the total number of messages I want to publish),
and I also set the
receiverQueueSize(10000)
in my Consumer client code. I think this is not a bug but a feature, because the sending bandwidth is much higher than the available bandwidth on the WAN. In Issue 451 there seems to be some discussion about the sending rate of the publisher, but I don't know what sending rate is reasonable in a WAN environment. Could you offer me some suggestions on what the sending bandwidth should be?
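For reference, this is roughly how the consumer-side setting mentioned above is applied. A minimal sketch: the service URL and subscription name are placeholders; the topic and queue size are the ones from this issue.

```java
// Minimal sketch of setting receiverQueueSize on the Pulsar Java consumer.
// Service URL and subscription name are placeholders, not from the issue.
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.Message;
import org.apache.pulsar.client.api.PulsarClient;

public class ReceiverQueueSketch {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        Consumer<byte[]> consumer = client.newConsumer()
                .topic("non-persistent://my-tenant/my-namespace/wa")
                .subscriptionName("my-sub")
                .receiverQueueSize(10000) // large enough to hold the whole 10k-message run
                .subscribe();

        Message<byte[]> msg = consumer.receive();
        consumer.acknowledge(msg);
        client.close();
    }
}
```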
The network information between the clusters, bandwidth (MByte/s) and latency (ms), is shown in the table below (table omitted).
To Reproduce Steps to reproduce the behavior:
bin/pulsar-perf produce non-persistent://my-tenant/my-namespace/wa -bm 0 -s 10240 -r 1000 -m 10000
at the publisher, and
bin/pulsar-perf consume non-persistent://my-tenant/my-namespace/wa
at the subscriber.
Expected behavior All the non-persistent messages are received correctly.
Screenshots (perf consumer output; the records received are below 10000 for the remote subscribers)