Open michaelwgnr opened 7 years ago
I would be glad if someone could answer this!
Thanks!
Hi, sorry for not responding earlier.
Your observations on the check_interval()
function are correct, this function is called whenever there's any update to a trickle instance. This is to avoid having to run a separate timer for each trickle instance: Instead of executing rule 2 at the exact time of the interval expiry, we do it whenever we do the first operation on a trickle instance after the timer has expired. This saves us some ~20 bytes of memory per trickle instance and a lot of CPU cycles, but obfuscates the procedure some. So yes, check_interval()
executes rule 2, but only after the interval has expired.
I think the confusion comes from differing interpretations of the Trickle RFC. In the OpenMesh, we want to broadcast values forever (but at exponentially increasing intervals). This is to let devices that enter the network while it's running to come up to speed on the state of the different handles. We interpret the C value as a per-transmit value, as rule 2 resets it every interval. As mentioned, this reset is done at the first opportunity after the interval has expired.
Please let me know if I misunderstood any parts of your question, or if there still are problems with the execution of this procedure. I'll work on my response time! :smiley:
Hi Trond
Thanks for the detailed response. I'm still a bit unclear on the details here. If I understand you correctly, there are two issues at play here. I think I understand the first one alright, but am unsure about the second:
check_interval()
happens any time there is an update to a trickle instance (i.e. any trickle instance becomes "inconsistent"). This is a deliberate design choice to save memory and CPU cycles because it reduces the number of timers that need to be running.Is this right?
I assume this leads to potentially heavily increased traffic, though: Suppose there are more active handles in the mesh than there is space in the handle cache:
Did you experience anything like that? Or do you generally suggest to use the mesh with a low number of handles or rather a handle cache sized to hold all the handles?
Sorry for grinding on this, but it seems for most of my potential use cases, it would be preferable to adhere to trickle as I interpreted it and stop rebroadcasting as soon as a certain consistency is reached, so new values will be quickly and "reliably" flooded through the network, but the network remains idle for most of the time, no matter how many handles are in use. Therefore I wonder if the behaviour would be modifiable to still have the memory and CPU savings, but only reset C when the affected handle receives an update?
If we add this to your first point, I agree:
The execution of rule 2 in check_interval() happens any time there is an update to a trickle instance, if the interval has expired..
The second point is correct, and so is your observation on the increased traffic. If a handle keeps dropping out of the cache between repeats from a neighbor, the device will start transmitting it again and again. This would balance out a bit by the fact that the cache removes entries on a least recently used basis, but it doesn't remove the problem all-together.
I think @victorpasse wanted similar behavior as you about a year ago, and I gave him this answer. Could this approach work for your use cases?
Hi there,
I experimented a bit with openMesh and also re-read the trickle RFC several times and I stumbled over something:
According to RFC 6206 - "The Trickle Algorithm", Section 4.2 (emphasis mine):
So my understanding of the interplay between counter c and consistency constant k is that as soon as a message is "consistent enough", transmission of the message should stop. The counter will only be reset upon receiving an inconsistent transmission.
When working with open mesh I observed that the nodes continue with their transmissions infinitely. Looking at the problem more thoroughly, I realized that the counter will be reset in the
check_interval
function intrickle.c
, whenever the time of the call is equal to or greater thanI
(ortrickle->i
in the code).check_interval
is called on reception of a consistent(!) transmission or - basicallay - whenevervh_order_update
inversion_handler.c
is called; which seems to be about every time any handle is received or needs to be transmitted (because of new value or timeout).So to me it looks like the openMesh implementation of trickle does not adhere to the algorithm, because it resets the counter
c
at other times than specified in the algorithm. Point 6 explicitly goes back to step 2, which is the only time I understand it can happen thatc
is reset.Is there a specific reason, why the counter is reset at these times or is this a bug?
Thanks for any help on this.