bytebeamio / rumqtt

The MQTT ecosystem in rust
Apache License 2.0
1.65k stars 255 forks source link

Store outgoing_pub in a HashMap for more memory efficient storage of unassigned pkids #880

Open ambaxter opened 5 months ago

ambaxter commented 5 months ago

Type of change

Bug fix (non-breaking change which fixes an issue)

Checklist:

I wrote an MQTT traffic simulator and noticed that creating many clients within a single process consumes a large amount of memory. The below example is for 1000 clients.

mqtt_before

This is due to rumqttc pre-allocating MqttState.outgoing_pub for each client. I replaced the Vec with a HashMap and dramatically reduced memory usage.

mqtt_after

I know that MqttState.outgoing_pub is most likely pre-allocated for low powered devices. Would it be possible to make using a HashMap either a runtime configuration or feature flag?

de-sh commented 5 months ago

By any chance does the code with which you tested these scenarios use QoS 1+?

Seems like the HashMap is saving space as it never gets utilized(which is the case in QoS 0). That shouldn't be the case when we end up using QoS 1+ where allocations will take up some cycles as the map size increases slowly.

swanandx commented 5 months ago

you can also configure how much you want to preallocate using MqttOptions, let's say you set max_inflight to be 10 to save memory, it won't allocate more than that.

ambaxter commented 5 months ago

By any chance does the code with which you tested use QoS 1+?

Seems like the HashMap is saving space as it never gets utilized(which is the case in QoS 0). That shouldn't be the case when we end up using QoS 1+ where allocations will take up some cycles as the map size increases slowly.

The simulation I've developed only uses QoS 1 &2. The HashMap should only be as large as the number of Publish entries waiting on PubRec or PubAck. Each HashMap could very well grow to be u16::MAX in length.

My test-case is currently running at 10,000 MQTT clients at ~1500 messages/s for the entire simulation. The simulator's RAM hasn't breached 800 MB in almost an hour of processing.

ambaxter commented 5 months ago

you can also configure how much you want to preallocate using MqttOptions, let's say you set max_inflight to be 10 to save memory, it won't allocate more than that.

True, but in many cases you won't know what the correct value should be just based off of traffic size alone. There are many variables to consider.

xiaocq2001 commented 4 months ago

IMHO we can use VecDeque to save pkid/publish and NoticeTx, the reasons:

  1. Since the packets are always ordered, the push_back and pop_front can be used, they work faster than insert and push of Vec.
  2. The memory usage is less than HashMap

Some of the proposed changes are included here: https://github.com/bytebeamio/rumqtt/commit/68fcb73b1d736449626e39d5528d74dbad82287e#diff-35dca8779a1d08a58f298e7f88a1ca9d2c32f79e97024252a6366e4a2f32325a