wandnz / streamevmon

Framework and pipeline for time series anomaly detection
GNU General Public License v3.0

AliasResolver memory usage grows unboundedly #53

Open danoost opened 3 years ago

danoost commented 3 years ago

The AliasResolver includes a mergedHosts field which keeps track of all the host mappings it has seen in its history. By design, it currently has no way of removing elements that it has seen before. This is so that hosts which reappear after a long absence are still resolved correctly.

I'm currently working on decoupling the alias resolution from any class that holds the full state of the graph, so it's not trivial to just prune old mergedHosts contents when hosts are fully removed from the graph. It could be possible to add a feedback loop from the later pruning step. This would let us intercept vertex removal events and keep the contents of mergedHosts reasonably in line with what the graph currently holds, but it risks losing useful alias data.

It's likely that we'll have to find a tradeoff, such as dropping host mappings after a certain amount of inactivity (in time or measurement count).
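
As a rough illustration of that tradeoff (a sketch only, not the actual AliasResolver code), the mapping store could remember when each entry was last touched and drop anything that has been idle for too long. The class name `ExpiringHostMappings`, its methods, and the use of plain strings for hosts are all hypothetical. The `long` "clock" value can be either a timestamp in milliseconds or a running measurement count, which covers both kinds of inactivity mentioned above.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical stand-in for the mergedHosts field. Each mapping remembers
// the last clock value (time or measurement count) at which it was used.
class ExpiringHostMappings {
    private static final class Entry {
        final String mergedHost;
        long lastSeen;

        Entry(String mergedHost, long lastSeen) {
            this.mergedHost = mergedHost;
            this.lastSeen = lastSeen;
        }
    }

    private final Map<String, Entry> mappings = new HashMap<>();
    private final long maxIdle;

    ExpiringHostMappings(long maxIdle) {
        this.maxIdle = maxIdle;
    }

    // Record a new mapping, or overwrite and refresh an existing one.
    void observe(String host, String mergedHost, long now) {
        mappings.put(host, new Entry(mergedHost, now));
    }

    // Look up a mapping. A hit counts as activity and refreshes the entry.
    // Returns null if the host is unknown or has already been evicted.
    String resolve(String host, long now) {
        Entry entry = mappings.get(host);
        if (entry == null) {
            return null;
        }
        entry.lastSeen = now;
        return entry.mergedHost;
    }

    // Remove every mapping idle for longer than maxIdle.
    // Returns the number of mappings removed.
    int evictIdle(long now) {
        int removed = 0;
        Iterator<Entry> it = mappings.values().iterator();
        while (it.hasNext()) {
            if (now - it.next().lastSeen > maxIdle) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return mappings.size();
    }
}
```

Calling `evictIdle` periodically bounds memory by the number of hosts active within the idle window. The cost is the one already noted: a host that returns after being evicted has to have its aliases rediscovered.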

danoost commented 3 years ago

We could perhaps change how the AliasResolver uses CheckpointedFunction so that it takes advantage of Flink's keyed state TTL (time-to-live), letting Flink handle the timeout behaviour directly.