Closed raltnoeder closed 5 years ago
This looks good, but it made me thing about the implementation a bit more.
The way we do it now, the collectors are responsible for communicating at what age events are stale. I think this is still good, as it that information is going to depend really heavily on the way they were collected, and these are what understand that best.
The exact number of polling cycles isn't actually super important, but I was wondering if perhaps keeping a short (well, maybe not short even, but not unlimited) history of the timestamps at the end of a polling cycle and then giving the prune time as being older than the timestamp at pollingEndTimestampHistory[3] or some such. That should scale with both long poll times and increased polling interval times.
We'd have to populate the history with some bogus future dates (Now() + year???) to start or gracefully handle the short period of time when that history isn't populated with initialized data, and it does introduce more state to this collector in particular, but not in too egregious of a way and we do already have FIFO queue type stuff: https://github.com/LINBIT/drbdtop/blob/master/pkg/resource/resource.go#L222
If that doesn't seem insane, do you think you could try that and see what the results are like?
That sounded like a good idea. I opened another pull request. The fix keeps a history of the last 3 poll cycle's timestamp (still the poll cycle's start time though, it doesn't make much of a difference for the history, but it guarantees that even if the time stamps are a little weird for whatever reason (e.g. SIGSTOP or whatever), all resources seen in the current poll cycle are save from being pruned), so all resources that were seen in the 3 poll cycles before the current poll cycle will have time stamps that are newer than the one used as a time reference for executing prune(). From the tests I have done so far it seems to work as expected.
See pull request #15, which is supposed to replace this one
The proposed fix prunes entries once after each poll cycle, based on the timestamp of the earliest event generated by that poll cycle, or a drbdtop-internal timestamp if no events were received from drbdsetup. (Some more details in the commit message)
It's not a 100% fix, because pruning was apparently supposed to happen if the last ~3 poll cycles did not update the resource, and if poll cycles take longer than the poll interval, then it might still remove entries that were updated within the last 3 poll cycles, but it will not remove entries that were updated in the current poll cycle, so all currently active/configured resources will be displayed, even if the poll cycle takes much longer than the configured poll interval.