Open petersilva opened 5 years ago
really don't want this to be big... I'll say 1000 bytes.
I think that 1000 bytes must be enough for everybody. On the other hand, if an institution publishes messages that are too big, then no one will subscribe to them. In the end such an institution will harm itself, because clients will poll the directory tree instead (and generate unnecessary load on the server). So the size will organically self-regulate :-)
This is implemented in the wmo_mesh example now: the --inline option, with --inline_max to experiment with the maximum inline message size.
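To make the size cutoff concrete, here is a minimal sketch of the decision an inlining publisher has to make. It is not the wmo_mesh code: the function name `build_message`, the field names (`relPath`, `size`, `content`), and the choice of base64 encoding are all assumptions for illustration. Only the idea of an on/off switch plus a maximum size comes from the options mentioned above.

```python
import base64

def build_message(rel_path, data, inline=True, inline_max=1000):
    """Return an announcement dict, embedding the data only if it is small enough.

    Hypothetical helper: field names and encoding are illustrative assumptions.
    """
    msg = {"relPath": rel_path, "size": len(data)}
    if inline and len(data) <= inline_max:
        # Small payload: ship it inside the message so no second fetch is needed.
        msg["content"] = {
            "encoding": "base64",
            "value": base64.b64encode(data).decode("ascii"),
        }
    return msg

small = build_message("bulletins/small.txt", b"x" * 200)
large = build_message("bulletins/large.bin", b"x" * 5000)
print("content" in small, "content" in large)
```

Anything above the limit is announced without a `content` field, so the subscriber falls back to a normal download.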
The self-regulation idea is a good one. On one hand, including the data in the payload saves time for small bulletins. On the other hand, if one is subscribing to two sources for all products, then one will only use whichever copy of each product arrives first, and all inlined data that does not arrive first is wasted (it would not have been downloaded if it were not inlined).
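A rough way to put a number on that waste, as a sketch only: assume every source announces the same products and that exactly one copy of each is used. The function name and the sample sizes are made up for illustration, and any encoding overhead on the inlined data is ignored.

```python
def wasted_inline_bytes(sizes, inline_max, n_sources):
    """Bytes of inlined data received but never used.

    Assumes every source sends every product and only the first copy is used,
    so (n_sources - 1) copies of each inlined payload are wasted.
    """
    inlined = sum(s for s in sizes if s <= inline_max)
    return inlined * (n_sources - 1)

sizes = [500, 800, 1500, 4000]  # hypothetical product sizes in bytes
print(wasted_inline_bytes(sizes, inline_max=1000, n_sources=2))
print(wasted_inline_bytes(sizes, inline_max=2048, n_sources=2))
```

Raising the limit moves more products under the threshold, so the wasted volume grows with both the limit and the number of redundant sources.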
Server-side filtering with MQTT (or AMQP) is fairly coarse, so one must, in general, request more messages than one genuinely intends to download. These other messages are filtered out by client-side reject clauses. So how many messages, inlined data included, are downloaded only to be rejected on the client side?
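One way to estimate that is to replay a sample of received messages through the client-side clauses and add up the inlined bytes of the ones that get rejected. The sketch below assumes ordered accept/reject regular expressions where the first match wins; the patterns, paths, and the `inline_size` field are hypothetical.

```python
import re

# Ordered (action, pattern) clauses; first match wins (an assumption here).
CLAUSES = [
    ("reject", re.compile(r".*\.tar$")),
    ("accept", re.compile(r"^bulletins/.*")),
    ("reject", re.compile(r".*")),
]

def is_accepted(rel_path):
    """Apply the clauses in order and return True if the path is accepted."""
    for action, pattern in CLAUSES:
        if pattern.match(rel_path):
            return action == "accept"
    return False

def rejected_inline_bytes(messages):
    """Sum the inlined payload sizes of messages the client goes on to reject."""
    return sum(m["inline_size"] for m in messages if not is_accepted(m["relPath"]))

messages = [  # hypothetical sample of what the broker delivered
    {"relPath": "bulletins/obs.txt", "inline_size": 300},
    {"relPath": "bulletins/archive.tar", "inline_size": 900},
    {"relPath": "model/field.grib2", "inline_size": 700},
]
print(rejected_inline_bytes(messages))
```

Without inlining, the two rejected messages would have cost only their headers; with inlining, their payloads cross the wire before the client can say no.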
Inlining worsens performance on a LAN: where the round-trip time is negligible, the optimization is negligible too, and is likely drowned out by the reduced message processing rate. In the LAN case using AMQP, one wants to spread the requests out to many instances, which is done more quickly without inlining. In Sarracenia, SFTP sessions are maintained, so while there is a round trip for the get request, one does not pay for connection establishment on each transfer.
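A back-of-the-envelope model of that trade-off, as a sketch under loud assumptions: with a persistent session, inlining saves roughly one round trip per file, and costs the time needed to push the extra (base64-encoded) bytes through message processing. The round-trip times and the processing rate below are placeholders, not measurements, and base64 is an assumed encoding.

```python
import math

def b64_len(n_bytes):
    """Length of the base64 encoding of n_bytes (4 characters per 3 bytes, padded)."""
    return 4 * math.ceil(n_bytes / 3)

def inline_gain_seconds(rtt_s, payload_bytes, msg_bytes_per_s):
    """Estimated time saved per file by inlining (negative means inlining is slower).

    Model assumption: saving = one round trip; cost = extra message bytes
    divided by the message processing rate.
    """
    cost = b64_len(payload_bytes) / msg_bytes_per_s
    return rtt_s - cost

# Placeholder numbers: 0.5 ms LAN round trip, 80 ms WAN round trip,
# 1 MB/s of message processing capacity.
lan = inline_gain_seconds(0.0005, 1000, 1_000_000)
wan = inline_gain_seconds(0.080, 1000, 1_000_000)
print(lan < 0, wan > 0)
```

Under these placeholder values the gain is negative on the LAN and positive on the WAN, which is the shape of the argument above; real numbers would have to come from measuring an actual broker.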
On the current feed from hpfx.collab, I upped the maximum to 2048 to get more files inlined, which provides a more frequent demonstration.
@davidpodeur brought up an interesting case:
One would need fifty or more parallel transfers to catch up with tar files. In this instance, a much higher limit for the size of embedded data makes sense, or an extended message type that refers to a tar bucket.
Well, we definitely cannot set one hard limit to fit all use-cases. It needs to remain configurable. We can just recommend - something like: "Keep it in kilobytes unless you are sure that your use case will benefit from a higher limit. Avoid going to megabytes unless your data is distributed only to a restricted group of systems that can cope with it."
This is a reference to #3 ... Separate from implementation concerns, inlining large data will have a severe effect on broker performance, so this issue will try to document a consensus value.